Preface



The Seventh International Conference on Principles and Practice of Constraint 
Programming (CP 2001) provided an international forum for cutting edge rese- 
arch into constraints. There were several important innovations at the conference 
this year. Most important of these were the Innovative Applications program, 
the Doctoral Program, and co-location with the 17th International Conference 
on Logic Programming (ICLP2001). 

The Innovative Applications (IA) Program showcased the very best applications
of constraint technology. It provided a forum for practitioners and end
users, and an interface between them and academic researchers. It took over
the task previously performed by the Conference on the Practical Application
of Constraint Technologies and Logic Programming (PACLP). I am especially
grateful to Edward Tsang, who was to be Chair of PACLP 2001, for chairing
this section of the conference.

The second innovation, the Doctoral Program, allowed PhD students to present
their work and to receive feedback from more senior members of the community.
I am especially grateful to Francesca Rossi, who chaired this section of
the conference, and who raised enough sponsorship to support the participation
of over two dozen students.

This volume contains the papers accepted for presentation at CP 2001. The 
conference attracted a record number of 135 submissions. Of these, 37 papers 
were accepted for presentation in the Technical Program. A further 9 papers 
were accepted into the Innovative Applications Program. In addition, 14 papers 
were accepted as short papers and presented as posters during the Technical 
Program. 

We were privileged to have three distinguished invited speakers this year: Pe- 
ter van Beek (University of Waterloo), Eugene Freuder (Cork Constraint Com- 
putation Center), and Moshe Vardi (Rice University). We also had a large num- 
ber of workshops and tutorials organized by Thomas Schiex, the Workshop and 
Tutorial Chair. 

Finally, I would like to thank Antonis Kakas, the Local Chair, who did a great
job organizing both CP 2001 and ICLP 2001. I would also like to thank again
Edward Tsang and Francesca Rossi, as well as Thomas Schiex, and last but not
least Ian Miguel, the Publicity Chair.



September 2001 



Toby Walsh 
Hybrid Benders Decomposition Algorithms
in Constraint Logic Programming 



Andrew Eremin and Mark Wallace 



IC-Parc 
London, UK 

{a.eremin, mgw}@icparc.ic.ac.uk



Abstract. Benders Decomposition is a form of hybridisation that al- 
lows linear programming to be combined with other kinds of algorithms. 
It extracts new constraints for one subproblem from the dual values of 
the other subproblem. This paper describes an implementation of Ben- 
ders Decomposition, in the ECLiPSe language, that enables it to be used 
within a constraint programming framework. The programmer is spared 
from having to write down the dual form of any subproblem, because 
it is derived by the system. Examples are used to show how problem 
constraints can be modelled in an undecomposed form. The programmer 
need only specify which variables belong to which subproblems, and the 
Benders Decomposition is extracted automatically. A class of minimal 
perturbation problems is used to illustrate how different kinds of algo- 
rithms can be used for the different subproblems. The implementation is 
tested on a set of minimal perturbation benchmarks, and the results are 
analysed. 



1 Introduction 

1.1 Forms of Hybridisation 

In recent years, research on combinatorial problem solving has begun to address
real world problems which arise in industry and commerce. These problems
are often large scale, complex, optimisation (LSCO) problems and are best
addressed by decomposing them into multiple subproblems. The optimal solutions
of the different subproblems are invariably incompatible with each other,
so researchers are now exploring ways of solving the subproblems in a way that
ensures the solutions are compatible with each other, i.e. globally consistent.
This research topic belongs to the area of “hybrid algorithms”, but more
specifically it addresses ways of making different solvers cooperate with each
other. In what follows we shall talk about “forms of hybridisation”.

An early form of hybridisation is the communication between global constraints
in constraint programming, via the finite domains of the shared variables.
Different subproblems are handled by different global constraints (for example
a scheduling subproblem by a cumulative constraint and a TSP subproblem by
a cycle constraint), and they act independently on the different subproblems,
yielding domain reductions. This is a clean and sound hybridisation form because
a domain reduction which is correct for a subproblem is necessarily correct for
any larger problem in which the subproblem is contained.

T. Walsh (Ed.): CP 2001, LNCS 2239, pp. 1-15, 2001.



1.2 Hybridisation Forms for Linear Programming 

Master Problems and other Subproblems. LSCO problems involve a cost 
function, and for performance reasons it is important to find solutions quickly 
that are not only feasible but also of low cost. Usually these cost functions are 
linear, or can be approximated by a linear or piecewise linear function. Linear 
programming offers efficient constraint solvers which can quickly return optimal 
solutions to problems whose cost function and constraints can be expressed using 
only linear expressions. Consequently most industrial LSCO problems involve 
one or more linear subproblems which are addressed using linear programming 
as available in commercial products such as XPRESS [8] and CPLEX [9].

Whilst global constraints classically return information excluding certain as- 
signments from any possible solution, linear solvers classically return just a single 
optimal solution. In contrast with global constraints, the information returned 
by a linear solver for a subproblem does not necessarily remain true for any 
larger problem in which it is embedded. Thus linear solvers cannot easily be 
hybridised in the same way as global constraints. 

Nevertheless several hybridisation forms have been developed for linear 
solvers, based on the concept of a “master” problem, for which the optimal solu- 
tion is found, and other subproblems which interact with the master problem. In 
the simplest case this interaction is as follows. The subproblem examines the last 
optimal solution produced for the master problem, and determines whether this 
solution violates any of the constraints of the subproblem. If so the subproblem 
returns to the master problem one or more alternative linear constraints which 
could be added to the master problem to prevent this violation occurring again. 
One of these constraints is added to the master problem and a new optimal 
solution is found. To prove global optimality each of the alternatives is added
to the master problem on a different branch of a search tree. These alternatives
should cover all possible ways of fixing the violation.

A generalisation of this form of hybridisation is “row generation” [10], where
a new set of constraints (“rows”) is added to the master problem at each node
of the search tree. Unimodular probing [11] is an integration of a form of row
generation into constraint programming.



Column Generation. Another form of hybridisation for linear programming
is column generation. In this case the master problem is to find the optimal
combination of “pieces” where each piece is itself a solution of another
subproblem. A typical application of column generation is to crew scheduling: 
the assignment of crew to a bus or flight schedule over a day or a month. There 
are complex constraints on the sequence of activities that can be undertaken by 
a single crew, and these constraints are handled in a subproblem whose solutions 
are complete tours which can be covered by a single crew over the time period.
The master problem is the optimal combination of such tours. The master prob- 
lem constraints enforce that each scheduled bus trip or flight must belong to 
one tour. Each tour is represented in the master problem by a variable, which 
corresponds to a column in the matrix representing the problem. 

In the general case, each call to another subproblem returns a solution which 
has the potential to improve on the current optimum for the master problem. 
Each call to a subproblem adds a column to the master problem, and hence the 
name “column generation” . 

A number of applications of column generation have been reported in which
the subproblem is solved by constraint programming [13,14]. A column generation
library has been implemented in the ECLiPSe constraint logic programming
system, which allows the subproblem, the communication of solutions and the
search all to be specified and controlled from the constraint program.

While column generation utilises the dual values returned from convex solvers 
to form the optimisation function of a subproblem, a closely related technique 
exploits them to approximate subproblem constraints within the optimisation 
function of the master problem. This technique is known as Lagrangian relaxation
and has been used for hybridising constraint programming and convex
optimisation by Sellmann and Fahle and by Benoist et al.



Other Hybridisation Forms. Besides optimal solutions, linear solvers can 
return several kinds of information about the solution. Reduced costs are the 
changes in the cost which would result from changes in the values of specific 
variables. These are, in fact, underestimates so if the reduced cost is “-10” the
actual increase in cost will be greater than or equal to 10. If the variable has a
finite domain, these reduced costs can be used to prune values from the domain
in the usual style of a global constraint. (A value is pruned from the domain if
the associated reduced cost is so bad it would produce a solution worse than the
current optimum.) In this way linear programming can be hybridised with other
solvers in the usual manner of constraint programming. Indeed the technique
has been used very successfully [18].
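
The pruning rule described above can be sketched in a few lines of Python. This
is only an illustration under stated assumptions: the function name, the
dictionary layout and the numbers are hypothetical, a minimisation problem is
assumed, and the reduced cost of assigning a value is taken as a non-negative
underestimate of the resulting cost increase over the LP optimum.

```python
def prune_by_reduced_cost(domains, reduced_costs, lp_bound, incumbent):
    """Drop every value whose reduced cost pushes the bound past the incumbent.

    domains:       {variable: set of values}            (hypothetical layout)
    reduced_costs: {(variable, value): cost increase}   (missing entries mean 0)
    lp_bound:      optimal cost of the linear relaxation
    incumbent:     cost of the best solution found so far
    """
    pruned = {}
    for var, values in domains.items():
        pruned[var] = {
            v for v in values
            # Keep v only if the underestimated cost is not worse than the incumbent.
            if lp_bound + reduced_costs.get((var, v), 0) <= incumbent
        }
    return pruned


domains = {"x": {0, 1, 2}}
reduced_costs = {("x", 1): 4, ("x", 2): 10}
new_domains = prune_by_reduced_cost(domains, reduced_costs, lp_bound=20, incumbent=28)
```

Because the reduced cost only underestimates the true increase, a pruned value
can never lead to a solution better than the incumbent, which is what makes the
reduction safe in the same sense as a global constraint.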

1.3 Benders Decomposition 

Benders Decomposition is a hybridisation form based on the master
problem/subproblem relationship. It makes use of an important and elegant aspect
of mathematical programming, the dual problem. Benders Decomposition
is applicable when some of the constraints and part of the optimisation function
exhibit duality. The master problem need not use mathematical programming at
all. The subproblems return information which can be extracted by solving the 
dual. The new constraints that are added to the master problem are extracted 
from the dual values of the subproblems. 

We have implemented Benders Decomposition in ECLiPSe and used it to
tackle several commercial applications in transportation and telecommunications.
The technique has proved very successful and has outperformed all other
hybridisation forms in these applications.

For the purposes of this paper we have also used Benders Decomposition to
tackle a set of benchmarks originally designed to test another hybridisation form,
Unimodular Probing [11]. Whilst our results on these benchmarks have not been
so striking as the applications mentioned above, they nicely illustrate the use
of Benders Decomposition and the combination of linear programming with a 
simple propagation algorithm for the master problem. From these benchmarks 
we also make some observations about the kinds of problems and decompositions 
that are most suited to the hybrid form of Benders Decomposition. 



1.4 Contents 

In the following section we introduce Benders Decomposition, explain and justify 
it, and present the generic Benders Decomposition algorithm. In section 3 we 
show how it is embedded in constraint programming. We describe the user inter- 
face, and how one models a problem to use Benders Decomposition in ECLiPSe. 
We also describe how it is implemented in ECLiPSe. In section 4 we present the 
application of Benders Decomposition to a “minimal perturbation” problem, its 
definition, explanation and results on a set of benchmarks. Section 5 concludes 
and discusses the next application, further work on modeling and integration, 
and open issues. 

2 Benders Decomposition 

Benders decomposition is a cut or row generation technique for the solution of
specially structured mixed integer linear programs that was introduced in the
OR literature. Given a problem $P$ over a set of variables $V$, if a subset $X$
of the variables can be identified for which fixing their values results in one or
more disconnected SubProblems ($SP_i$) over the variable sets $Y_i$
which are easily soluble (normally due to some structural property of the
resulting constraints), it may be beneficial to solve the problem by a two stage
iterative procedure.

At each iteration $k$ a Relaxed Master Problem ($RMP^k$) in the complicating or
connecting variables $X$ is first solved and the solution assignment $X = X^k$ used
to construct the subproblems $SP_i^k$; these subproblems are then solved and the
solutions used to tighten the relaxation of the master problem by introducing
Benders Cuts, $\beta_i^k(X)$.

The subproblems optimise over reduced dimensionality subspaces $\mathcal{D}_{Y_i}$ of the
original problem solution space obtained by fixing the variables $X = X^k$, while
the master problem optimises over the optimal solutions of these subspaces,
guided by the cuts generated.

In classical Benders Decomposition both the master and subproblems are linear
and are solved by MILP algorithms, while the cuts are derived from Duality
theory. In general however, we are free to use any appropriate solution methods
for master and subproblems: all that is required is an assignment of the
master problem variables $X = X^k$ to construct convex subproblems, and a
procedure for generating valid cuts from subproblem solutions. The most naive such
scheme would merely result in the master problem enumerating all assignments
of $X$, while more informative cuts can result in substantial pruning of the master
problem search space.



2.1 Classical Benders Decomposition 

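Consider the linear program $P$ given by:

\[
\begin{aligned}
P:\quad \min\ & f^{T}x + \sum_{i=1}^{I} c_i^{T} y_i \\
\text{subject to}\ & G_i x + A_i y_i \ge b_i \quad \forall i \\
& x \in \mathcal{D}_X \\
& y_i \ge 0 \quad \forall i
\end{aligned}
\tag{1}
\]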



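When $x$ is fixed to some value $x^k$ we have linear programs in $y_i$ which may be
specially structured or easy to solve, prompting us to partition the problem as
follows:

\[
\begin{aligned}
P:\quad & \min_{x \in \mathcal{D}_X} \Big\{ f^{T}x + \sum_{i=1}^{I} \big( \min \{ c_i^{T} y_i : A_i y_i \ge b_i - G_i x,\ y_i \ge 0 \} \big) \Big\} \\
=\ & \min_{x \in \mathcal{D}_X} \Big\{ f^{T}x + \sum_{i=1}^{I} \big( \max \{ u_i (b_i - G_i x) : u_i A_i \le c_i,\ u_i \ge 0 \} \big) \Big\}
\end{aligned}
\tag{2}
\]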



where the inner optimizations have been dualised. Given that
$U_i = \{u_i : u_i A_i \le c_i,\ u_i \ge 0\}$ is non-empty for each $i$, either there is an
extreme point optimal solution to each inner optimization or it is unbounded
along an extreme ray; letting $u_i^1, \ldots, u_i^{t_i}$ and $d_i^1, \ldots, d_i^{s_i}$ be
respectively the extreme points and directions of $U_i$ we can rewrite (2) as the
mixed integer Master Problem MP:

\[
\begin{aligned}
MP:\quad \min\ & z = f^{T}x + \sum_{i=1}^{I} \beta_i \\
\text{subject to}\ & \beta_i \ge u_i^{k} (b_i - G_i x) \quad \forall i\ \forall k \\
& 0 \ge d_i^{l} (b_i - G_i x) \quad \forall i\ \forall l \\
& x \in \mathcal{D}_X
\end{aligned}
\tag{3}
\]



Since there will typically be very many extreme points and directions of each $U_i$
and thus constraints in (3), we solve relaxed master problems containing a subset
of the constraints. If for some relaxed master problem $RMP^k$ the optimal
relaxed solution $(z^k, x^k)$ satisfies all the constraints of (3), then
$(z^k, x^k, y_1^k, \ldots, y_I^k)$ is an optimal solution of (1); otherwise there exists
some constraint or Benders Cut in (3) which is violated for $x = x^k$, which we
add to $RMP^k$ to form $RMP^{k+1}$ and iterate.

To determine such a cut or prove optimality we obtain the optimal solution
$(\beta_i^k, u_i^k)$ of the Subproblems $SP_i^k$ formed by fixing $x = x^k$ in (2):

\[
\begin{aligned}
SP_i^k:\quad \max\ & \beta_i^k = u_i (b_i - G_i x^k) \\
\text{subject to}\ & u_i A_i \le c_i \\
& u_i \ge 0
\end{aligned}
\tag{4}
\]

If any subproblem $SP_i^k$ has an unbounded optimal solution for some $x^k$ then the
primal of the subproblem is infeasible for $x^k$; if any subproblem $SP_i^k$ is
infeasible for some $x^k$ then it is infeasible (and the primal of the subproblem is
infeasible or unbounded) for any $x$ since the (empty) feasible region $U_i$ is
independent of $x$. In either case we proceed by considering the Homogeneous Dual
of the primal of the subproblem:

\[
\begin{aligned}
\max\ & u_i (b_i - G_i x^k) \\
\text{subject to}\ & u_i A_i \le 0 \\
& u_i \ge 0
\end{aligned}
\tag{5}
\]

This problem is always feasible ($u_i = 0$ is a solution), having an unbounded
optimum precisely when the primal is infeasible and a finite optimal solution
when the primal is feasible. In the unbounded case we can obtain a cut

\[
u_i^k (b_i - G_i x) \le 0
\]

corresponding to an extreme direction of the cone
$\{u_i : u_i A_i \le 0,\ u_i \ge 0\}$. The complete Benders decomposition algorithm
proceeds as follows:

Algorithm 1 The Benders Decomposition Algorithm

1. Initialisation step: From the original linear program $P$ (1) construct the
   relaxed master problem $RMP^0$ (3) with the initial constraint set
   $x \in \mathcal{D}_X$ and set $k = 0$.

2. Iterative step: From the current relaxed master problem $RMP^k$ with optimal
   solution $(z^k, x^k)$ construct $RMP^{k+1}$ with optimal solution
   $(z^{k+1}, x^{k+1})$: fix $x = x^k$ in $P$, and solve the resulting subproblems
   $SP_i^k$ (4); there are three cases to consider:

   a) $SP_i^k$ is primal unbounded for some $i$: halt with the original problem
      having unbounded solution.

   b) $y_i^k, u_i^k$ are respectively primal and dual optimal solutions of
      subproblem $SP_i^k$ with objective values $\beta_i^k$ for each $i$; there are
      two cases to consider:

      i. $f^{T}x^k + \sum_{i=1}^{I} \beta_i^k = z^k$: halt with
         $(z^k, x^k, y_1^k, \ldots, y_I^k)$ as the optimal solution to the original
         problem.

      ii. $f^{T}x^k + \sum_{i=1}^{I} \beta_i^k > z^k$: add the Benders Cuts
          $\beta_i \ge u_i^k (b_i - G_i x)$ to $RMP^k$ to form the new relaxed
          master problem $RMP^{k+1}$, set $k = k + 1$ and return to step 2.

   c) $SP_i^k$ is dual unbounded or both primal and dual infeasible for some $i$:
      find an extreme direction $d_i^k$ of the homogeneous dual leading to
      unboundedness; add the cut $d_i^k (b_i - G_i x) \le 0$ to $RMP^k$ to form
      the new relaxed master problem $RMP^{k+1}$, set $k = k + 1$ and return to
      step 2.
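
The iteration of Algorithm 1 can be illustrated with a small self-contained
Python sketch. It is not the ECLiPSe implementation described in this paper and
rests on simplifying assumptions chosen so that no LP solver is needed: there is
a single subproblem with $A = I$ and $c \ge 0$ (so the dual optimum has a closed
form and the subproblem is always feasible, making case c) unnecessary), the
master problem is solved by enumerating a small finite domain, and all data are
integral. All function names and the example data are hypothetical.

```python
from itertools import product


def slack(x, G, b):
    """Return b - G x, the right-hand side left for y once x is fixed."""
    return [bj - sum(g * xi for g, xi in zip(row, x)) for row, bj in zip(G, b)]


def solve_dual_subproblem(x, G, b, c):
    """Maximise u.(b - G x) subject to 0 <= u <= c (the dual when A = I)."""
    r = slack(x, G, b)
    u = [cj if rj > 0 else 0 for cj, rj in zip(c, r)]
    return sum(uj * rj for uj, rj in zip(u, r)), u


def solve_relaxed_master(domain, f, G, b, cuts):
    """Minimise f.x + beta over the finite domain, subject to the cuts so far."""
    best = None
    for x in domain:
        r = slack(x, G, b)
        # beta >= 0 is valid because c >= 0 and y >= 0; each stored dual
        # solution u contributes the Benders cut beta >= u.(b - G x).
        beta = max([0] + [sum(uj * rj for uj, rj in zip(u, r)) for u in cuts])
        z = sum(fi * xi for fi, xi in zip(f, x)) + beta
        if best is None or (z, x) < best:
            best = (z, x)
    return best


def benders(f, G, b, c, domain):
    """Alternate master and subproblem until the bounds meet (case b.i)."""
    cuts = []
    while True:
        z, x = solve_relaxed_master(domain, f, G, b, cuts)
        beta, u = solve_dual_subproblem(x, G, b, c)
        cost = sum(fi * xi for fi, xi in zip(f, x)) + beta
        if cost == z:          # subproblem value matches the master estimate
            return cost, x
        cuts.append(u)         # case b.ii: add the violated Benders cut


# min 3*x0 + 2*x1 + 5*y0 + y1 + 4*y2  subject to  G x + y >= b, y >= 0
f, c = [3, 2], [5, 1, 4]
G, b = [[1, 0], [0, 1], [1, 1]], [2, 3, 4]
domain = list(product(range(4), repeat=2))
result = benders(f, G, b, c, domain)
```

On this toy instance the first master solution is $x = (0, 0)$ with $z = 0$; the
subproblem returns $u = (5, 1, 4)$, and after this single cut the master
proposes $x = (2, 2)$, where the subproblem value closes the gap.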
2.2 Hybrid Benders Decomposition

The classical linear Benders Decomposition can be generalised to cover problems
in which the constraints and objective function are nonlinear, using any
appropriate solution method for $RMP^k$ and $SP_i^k$; we require only a procedure
for generating valid lower bounds $\beta_i^k(x)$ from the solutions of $SP_i^k$. In
its most general form we have the original problem:

\[
\begin{aligned}
P:\quad \min\ & f(f_1(x, y_1), \ldots, f_I(x, y_I)) \\
\text{subject to}\ & g_i(x, y_i) \ge b_i \quad \forall i \\
& x \in \mathcal{D}_X \\
& y_i \in \mathcal{D}_{Y_i} \quad \forall i
\end{aligned}
\tag{6}
\]

which we decompose into the master problem:

\[
\begin{aligned}
MP:\quad \min\ & z = f(x, \beta_1, \ldots, \beta_I) \\
\text{subject to}\ & \beta_i \ge \beta_i^{k}(x) \quad \forall i\ \forall k \\
& 0 \ge \beta_i^{l}(x) \quad \forall i\ \forall l \\
& x \in \mathcal{D}_X
\end{aligned}
\tag{7}
\]

and subproblems:

\[
\begin{aligned}
SP_i^k:\quad \min\ & f_i(x^k, y_i) \\
\text{subject to}\ & g_i(x^k, y_i) \ge b_i \\
& y_i \in \mathcal{D}_{Y_i}
\end{aligned}
\tag{8}
\]

In particular when we can identify one or more distinct sets of variables in which
the problem constraints and objective function are linear and a complicating set
of variables, it will be useful to decompose the problem into a nonlinear relaxed
master problem and linear subproblems.

3 Embedding Benders Decomposition in Constraint 
Programming 

In this section we discuss the implementation of Benders Decomposition in 
ECLiPSe . In designing the structure of the implementation two important con- 
siderations were to maintain the flexibility of the approach and to ensure ease 
of use for non-mathematicians. 

The flexibility of hybrid Benders Decomposition algorithms is due in large part to 
the possibility of using arbitrary solution methods for master and subproblems; 
in order to allow appropriate solvers to be simply slotted in to the framework it 
is essential to cleanly separate the method of solution of master and subproblems 
from the communication of solutions between them. 

As many users of the solver may be unfamiliar with the intricacies of linear
programming and duality theory, it is important to provide a user interface
that allows problems to be modeled in a natural and straightforward
formulation. All constraints are therefore input in their original formulation, i.e.
without having been decomposed and dualised and containing both master and
subproblem variables. The sets of variables occurring solely in the subproblems
are specified when the optimisation is performed, and the original problem
constraints are automatically decomposed into master and subproblem constraints
and the subproblems dualised.

3.1 ECLiPSe Implementation 

The implementation of Benders Decomposition in ECLiPSe uses the same fea- 
tures of the language that are used to implement finite domain and other con- 
straints. These are demons, variable attributes, waking conditions, and priorities. 

A demon is a procedure which, on completing its processing, suspends itself. 
It can be woken repeatedly, each time re-suspending on completion, until killed 
by an explicit command. Demons are typically used to implement constraint 
propagation. For Benders Decomposition a demon is used to implement the 
solver for the master problem, with separate demons for each subproblem. 

A variable attribute is used to hold information about a variable, such as its 
finite domain. Programmers can add further attributes, and for Benders decom- 
position an attribute is used to hold a tentative value for each of the variables 
in the master problem. Each time the master problem is solved, the tentative 
values of all the variables are updated to record the new solution. 

When the waking conditions for a demon are satisfied, it wakes. For a finite 
domain constraint this is typically a reduction in the domain of any of the 
variables in the constraint. For the subproblems in Benders Decomposition the 
waking condition is a change in the tentative values of any variable linking 
the subproblem to the master problem. Thus each time the master problem is 
solved any subproblem whose linking variables now have a new value is woken, 
and solved again. The master problem is woken whenever a new constraint (in 
the form of a Benders cut) is passed to the solver. Thus processing stops at some 
iteration either if after solving the master problem no subproblems are woken, 
or if after solving all the subproblems no new cuts are produced. 

Priorities are used in ECLiPSe to ensure that when several demons are woken 
they are executed in order of priority. For finite domain propagation this is 
used to ensure that simple constraints, such as inequalities, are handled before 
expensive global constraints. By setting the subproblems at a higher priority 
than the master problem, it is ensured that all the subproblems are solved and 
the resulting Benders cuts are all added to the master problem, before the master 
problem itself is solved again. While it is possible to wake the master problem 
early with only some cuts added by setting lower priorities for subproblems, this 
proved ineffective in practice. 
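
The interplay of waking conditions and priorities described above can be
mimicked with a toy scheduler. The Python sketch below is only an analogy and
does not use or reproduce any ECLiPSe API: the class and function names are
hypothetical, a lower number is assumed to mean a higher priority, and the
"solvers" merely count how often they run instead of solving anything.

```python
import heapq


class DemonScheduler:
    """Runs woken demons in priority order; a lower number runs first."""

    def __init__(self):
        self._queue = []
        self._pending = set()
        self._counter = 0   # tie-breaker that preserves wake order

    def wake(self, priority, name, demon):
        if name in self._pending:       # already scheduled, do not queue twice
            return
        self._pending.add(name)
        heapq.heappush(self._queue, (priority, self._counter, name, demon))
        self._counter += 1

    def run(self):
        trace = []
        while self._queue:
            _, _, name, demon = heapq.heappop(self._queue)
            self._pending.discard(name)
            trace.append(name)
            demon(self)
        return trace


SUBPROBLEM_PRIORITY, MASTER_PRIORITY = 1, 2
state = {"master_solves": 0, "cuts": 0}


def master(scheduler):
    state["master_solves"] += 1
    # Pretend the first two master solutions change the tentative values of
    # the linking variables, which wakes both subproblems.
    if state["master_solves"] <= 2:
        for name in ("sub1", "sub2"):
            scheduler.wake(SUBPROBLEM_PRIORITY, name, subproblem)


def subproblem(scheduler):
    state["cuts"] += 1                                  # a new cut is produced
    scheduler.wake(MASTER_PRIORITY, "master", master)   # a cut wakes the master


scheduler = DemonScheduler()
scheduler.wake(MASTER_PRIORITY, "master", master)
trace = scheduler.run()
```

Because the subproblems carry the higher priority, both of them run, and both
cuts are posted, before the master problem is solved again; processing stops
once a master solve wakes no subproblem.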

4 Benders Decomposition for Scheduling Problems 

4.1 Minimal Perturbation in Dynamic Scheduling with Time 
Windows 

The minimal perturbation dynamic scheduling problem with time windows and
side constraints is a variant of the classic scheduling problem with time windows:
given a current schedule for a set of n possibly variable duration tasks with time
windows on their start and end time points, a set C of unary and binary side
constraints over these time points and a reduced number of resources r, we are
required to produce a new schedule that is feasible with respect to the existing
time windows and constraints and the new resource constraint, and that is
minimally different from the current schedule.

The user enters these problems in a simple form that is automatically translated 
into a set of constraints that can be passed to the bd library. For the purposes of 
this paper, in the next section we give the full model generated by the translator. 
The subsequent section reports how this model is split into a master/subproblem
form for Benders Decomposition.



4.2 The Constraints Modeling Minimal Perturbation 

For each task $T_i$ in the current schedule with current start and end times
$t_{s_i}, t_{e_i}$ respectively there are:

Time point variables $s_i, e_i$ for the start and end of the task, and task
duration constraints

\[
(s_i, e_i) \in \mathcal{D}_i \tag{9}
\]

where $\mathcal{D}_i = \{(s, e) : e - s \ge l_i,\ e - s \le u_i,\ l_{s_i} \le s \le u_{s_i},\ l_{e_i} \le e \le u_{e_i}\}$
and $l_{s_i}, u_{s_i}, l_{e_i}, u_{e_i}, l_i, u_i$ are derived from the time windows of
the task start and end points and any constraints on these time points in $C$.

Perturbation cost variables $c_{s_i}, c_{e_i}$ and perturbation cost constraints

\[
(c_{s_i}, s_i, c_{e_i}, e_i) \in \mathcal{P}_i \tag{10}
\]

where $\mathcal{P}_i = \{(c_s, s, c_e, e) : c_s \ge s - t_{s_i},\ c_s \ge t_{s_i} - s,\ c_e \ge e - t_{e_i},\ c_e \ge t_{e_i} - e\}$
so that $c_{s_i} \ge |s_i - t_{s_i}|$, $c_{e_i} \ge |e_i - t_{e_i}|$.
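
The linearisation of the absolute values used in the perturbation cost
constraints can be checked numerically. The short Python sketch below (the
function name is hypothetical) computes the smallest costs satisfying the two
pairs of linear inequalities; under minimisation these coincide with the
absolute deviations of the start and end time points from their current values.

```python
def perturbation_cost(s, e, t_s, t_e):
    """Smallest c_s + c_e allowed by the four linear inequalities."""
    c_s = max(s - t_s, t_s - s)   # c_s >= s - t_s  and  c_s >= t_s - s
    c_e = max(e - t_e, t_e - e)   # c_e >= e - t_e  and  c_e >= t_e - e
    return c_s + c_e


# A task currently scheduled on [3, 10] that is moved to [5, 9].
cost = perturbation_cost(s=5, e=9, t_s=3, t_e=10)
```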

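For each pair of tasks $T_i, T_j$ there are:

Binary non-overlap variables $Pre_{ij}, Post_{ij}$ for each task $T_j \ne T_i$ which
take the value 1 iff task $i$ starts before the start of task $j$ and after the
end of task $j$ respectively, so that we have

\[
Pre_{ij} = \begin{cases} 1 & \text{if } s_i < s_j \\ 0 & \text{if } s_i \ge s_j \end{cases}
\qquad
Post_{ij} = \begin{cases} 1 & \text{if } s_i \ge e_j \\ 0 & \text{if } s_i < e_j \end{cases}
\]

and the distances between the time points $s_i$ and $s_j, e_j$ are bounded by

\[
\begin{aligned}
s_i - s_j &\ge (l_{s_i} - u_{s_j})\, Pre_{ij} \\
s_i - s_j &\le (l_{s_j} - u_{s_i} - 1)\, Pre_{ij} + (u_{s_i} - l_{s_j}) \\
s_i - e_j &\ge (u_{e_j} - l_{s_i})\, Post_{ij} + (l_{s_i} - u_{e_j}) \\
s_i - e_j &\le (u_{s_i} - l_{e_j} + 1)\, Post_{ij} - 1
\end{aligned}
\tag{11}
\]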
The resource feasibility constraint that the start time point $s_i$ overlaps
with at most $r$ other tasks

\[
\sum_{j \ne i} (Pre_{ij} + Post_{ij}) \ge n - r - 1 \tag{12}
\]

Time point distance constraints between $s_i, e_i$ and all other time points.
Since for each task $T_j \ne T_i$ we have the distance bounds (11) between $s_i$ and
$s_j$ and between $s_i$ and $e_j$, of which at most half can be binding, we combine
them with the binary constraints

Si ^ Sj -\- hij Cj ^ 5^ “t“ ^Uij 

Sj > 6j + bi^. 6i > ej + hf... 

appearing in the constraint set C to give the distance constraints 

: Pij : Pij : Pij ) ^ /' 1 



where 

{ ( ; ^2 , Sj , Cj , , T , t/) . 

Si Sj ^ B ^ Si Cj ^ T, Si “t“ €j ^ t/, Cj Cj ^ } 

{{si,ei,Sj,ej,B, L, U, Preij.PrCji, Postij) : 

B>hj,L> bi--,U > bui^, 

B ^ P — “t“ l) P^Oji (^Isi 7 

L > {Ue- - Isi) Postij + {Is- - Uej) , 

U > {lej - Usi - l) Postij + 1} 

Valid ordering constraints. For each task $T_j \ne T_i$ there are many additional
constraints that we may choose to introduce restricting the binary variables
to represent a valid ordering. These constraints are not necessary for the
correctness of the algorithm as invalid orderings will be infeasible to the
subproblem, but may improve its efficiency as fewer iterations will be needed.
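
The meaning of the non-overlap variables and of the resource feasibility
constraint (12) can be made concrete with a small Python check. This is a
hedged sketch rather than part of the model passed to the solver: the function
names are hypothetical, and a schedule is simply assumed to be a list of
(start, end) pairs with fixed values.

```python
def non_overlap_sum(tasks, i):
    """Sum over j != i of Pre_ij + Post_ij for a fixed schedule."""
    s_i, _ = tasks[i]
    total = 0
    for j, (s_j, e_j) in enumerate(tasks):
        if j == i:
            continue
        pre = 1 if s_i < s_j else 0     # task i starts before task j starts
        post = 1 if s_i >= e_j else 0   # task i starts after task j ends
        total += pre + post
    return total


def resource_feasible(tasks, r):
    """Check constraint (12) for every task: at most r overlaps at each start."""
    n = len(tasks)
    return all(non_overlap_sum(tasks, i) >= n - r - 1 for i in range(n))


# The start of the third task (time 2) falls inside the first two tasks.
tasks = [(0, 4), (1, 3), (2, 5), (6, 8)]
feasible_with_two = resource_feasible(tasks, r=2)
feasible_with_one = resource_feasible(tasks, r=1)
```

A task $j$ contributes nothing to the sum exactly when $s_j \le s_i < e_j$, that
is, when it is running at the start of task $i$, so the sum counts the $n - 1$
other tasks minus those overlapping $s_i$.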

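The complete MILP problem formulation is then

\[
\begin{aligned}
P:\quad \min\ & \sum_{i=1}^{n} (c_{s_i} + c_{e_i}) \\
\text{subject to}\ & (c_{s_i}, s_i, c_{e_i}, e_i) \in \mathcal{P}_i && \forall i \\
& (s_i, e_i) \in \mathcal{D}_i && \forall i \\
& (s_i, e_i, s_j, e_j, B_{ij}, L_{ij}, U_{ij}) \in \mathcal{B}_{ij} && \forall i\ \forall j \ne i \\
& (s_i, e_i, s_j, e_j, B_{ij}, L_{ij}, U_{ij}, Pre_{ij}, Pre_{ji}, Post_{ij}) \in \mathcal{O}_{ij} && \forall i\ \forall j \ne i \\
& \sum_{j \ne i} (Pre_{ij} + Post_{ij}) \ge n - r - 1 && \forall i
\end{aligned}
\tag{14}
\]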
4.3 Benders Decomposition Model for Minimal Perturbation
Master Problem. 



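\[
\begin{aligned}
MP:\quad \min\ & z \\
\text{subject to}\ & \beta^{k}(B, L, U) \le z && \forall k \\
& \beta^{l}(B, L, U) \le 0 && \forall l \\
& (s_i, e_i, s_j, e_j, B_{ij}, L_{ij}, U_{ij}, Pre_{ij}, Pre_{ji}, Post_{ij}) \in \mathcal{O}_{ij} && \forall i\ \forall j \ne i \\
& \sum_{j \ne i} (Pre_{ij} + Post_{ij}) \ge n - r - 1 && \forall i
\end{aligned}
\tag{15}
\]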



Subproblem. There is a single subproblem with primal formulation 



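\[
\begin{aligned}
LP^k:\quad \min\ & \sum_{i=1}^{n} (c_{s_i} + c_{e_i}) \\
\text{subject to}\ & (c_{s_i}, s_i, c_{e_i}, e_i) \in \mathcal{P}_i && \forall i \\
& (s_i, e_i) \in \mathcal{D}_i && \forall i \\
& (s_i, e_i, s_j, e_j, B_{ij}^{k}, L_{ij}^{k}, U_{ij}^{k}) \in \mathcal{B}_{ij} && \forall i\ \forall j \ne i
\end{aligned}
\tag{16}
\]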



The Benders Decomposition library in ECLiPSe automatically extracts a dual
formulation of the subproblem. For the current subproblem $LP^k$, the dual has
the form:



n / 

SP*^ : max jWBij + PijWLij + UijWu,j) 

z=i y j^i 

subject to Y^j^i + WLij - Wu.. - WBji) 

+ Wt^. - Wi^ + Wui + Wi^, - Wu,. < 0 



( 



Wb^,. - WLii - WUii - Wb, 

Wt,. + Wl, - Wu, + - Wzze. < 0 



> -1 

Wt,. z tUi, . < 1 



Wl, , Wui z Wl,, z Wu,. Z W^, , Wu,. > 0 

wbb z wlb z wu,„ , Wb,.^ > 0 Vj yf * 



> Vi 



where 



ttz ^s,Wt,^ 

ls,Wl^^ 



+ te,Wt,^ + kwi, + UiWu, + Yj^i + 

- Us,Wu^^ + le.WB, - Ue^Wu,. 



(17) 



Solutions to $SP^k$ produce cuts of the form $z \ge \beta^k(B, L, U)$ which exclude
orderings with worse cost from further relaxed master problems when the subproblem
is feasible, or $\beta^k(B, L, U) \le 0$ which exclude orderings infeasible to the
start windows and durations of the tasks when the subproblem is infeasible, where



/3"(B,L,U) 



1 

E 




+ E {^BijPij + WB,-L,j + WB^.Uij'^ 

j¥=i 
All coefficients and constants in the cuts are integral since the subproblems
are totally unimodular.

4.4 Results and Discussion 

Summary. We ran this model on 100 minimal perturbation problem instances. 
The number of variables in the problem model was around 900, and there were 
some 1400 constraints in the master problem and around 20 in the subproblem. 
Most problems were solved within 10 iterations between master and subproblem, 
though a few notched up hundreds of iterations. 

The time and number of iterations for each problem are given in Table 1. The
bulk of the time was spent in the finite domain search used to solve the master 
problem. Typically, for the feasible instances, the optimal solution was found 
early in the search, and much time was wasted in generating further solutions 
to the master problem which were not better in the context of the full problem. 

Correct and optimal solutions to all the problems were returned, but the
performance was an order of magnitude slower than the specially designed algorithm
presented in [11].

Analysis. Minimal perturbation can be decomposed into a master and subproblem
for the Benders Decomposition approach, but the sizes of the two problems
are very disparate. The behaviour of the algorithm on the benchmark problems
reflects the number of constraints: the subproblems are trivial and almost all the
time is spent in the master problem. The imbalance is probably an indication
that this algorithm is better suited to problems with larger or more complex
subproblems.

Nevertheless it is not always the number of constraints that makes a problem
hard, but the difficulty of handling these constraints. It may be that the master
problem constraints, while numerous, are easy to handle if the right algorithm 
is used. 

Currently the algorithm used to solve the master problem is a two-phase 
finite domain labelling routine. In the first phase a single step lookahead is 
used to instantiate binary variables that cannot take one of their values. In
the second phase all the binary variables are labelled, choosing first the variables
at the bottleneck of the minimal perturbation scheduling problem. This is not 
only a relatively naive search method, but it also lacks any active handling of 
the optimisation function. Linear programming does offer an active handling of 
the optimisation function. Thus, using a hybrid algorithm to tackle the master 
problem within a larger Benders Decomposition hybridisation form, could be 
very effective on these minimal perturbation problems. 

Benders Decomposition has proven to be a very efficient and scalable approach
when the problem breaks down into a master problem and multiple subproblems.
The minimal perturbation problems benchmarked in this paper involve a single
kind of resource. These problems do not have an apparent decomposition with
multiple subproblems. This is a second reason why our benchmark results do not
compete with the best current approach on this class of problems. Minimal
perturbation problems involving different kinds of resources might, by contrast,
prove to be very amenable to the Benders Decomposition form of hybridisation.



Table 1. Number of iterations and total solution time for Benders Decomposition on
RFP benchmark data

Problem  Iterations     Time   Problem  Iterations     Time   Problem  Iterations     Time
      1          11     4.92        35           4     1.09        69          26    39.48
      2          12     3.16        36          20     7.06        70          13     4.86
      3          10     2.40        37          22    20.91        71           -     >200
      4          15    11.30        38          36    67.48        72           -     >200
      5          16     7.93        39          59   184.57        73           -     >200
      6          58   109.22        40          13     5.66        74          26    18.72
      7          25    19.82        41          28    27.05        75          91   154.00
      8          10     3.27        42           9     5.86        76          12     3.49
      9          32    16.25        43          39    21.02        77          54   111.17
     10         107   151.01        44          25     9.43        78          35    37.52
     11           -     >200        45          11     5.20        79          44    38.00
     12           -     >200        46           -     >200        80          10     3.56
     13          44    96.77        47           5     1.37        81          28    12.69
     14          29    18.30        48          51    51.75        82           8     2.01
     15          70    83.87        49           9     2.06        83          16    14.52
     16          20    30.96        50          18     8.80        84          32    22.24
     17          23    11.65        51          30    19.44        85          20     4.94
     18          18    15.16        52          43   119.66        86           -     >200
     19          14     4.94        53          28    26.10        87          18     9.56
     20          21     8.17        54          33    17.32        88          12     4.72
     21          19     5.01        55          14     6.01        89           7     2.26
     22          60   180.47        56          14     9.95        90          43    42.51
     23          20     8.46        57          45   100.94        91           8     2.12
     24          39    82.93        58           4     0.88        92          54    111.5
     25          13     2.74        59           8     2.45        93           -     >200
     26           3     0.71        60           -     >200        94          25     8.08
     27          10     7.14        61          19     9.41        95           8     2.99
     28          22    12.23        62          24    11.48        96          22    10.97
     29          27    13.24        63           -     >200        97           5     1.59
     30           -     >200        64          46    95.07        98           6     2.37
     31          42    36.69        65          30    18.62        99          15     4.82
     32          15     4.48        66          14     5.57       100          19    47.61
     33          15     8.77        67          10     3.10
     34          20    23.70        68          62   132.87


5 Conclusion

This paper has investigated hybridisation forms for problems that admit a de- 
composition. A variety of hybridisation forms can be used in case one or more 
subproblems are handled by linear programming. We aim to make them all avail- 
able in the ECLiPSe language in a way that allows users to experiment easily 
with the different alternatives so as to quickly find the best hybrid algorithm for 
the problem at hand. 

Benders Decomposition is a technique that has not, to date, been applied to
many real problems within the CP community. Publications on this technique
have described a few pedagogical examples and “academic” problem classes such
as satisfiability [20,21]. This paper presents the first application of Benders
Decomposition to a set of minimal perturbation problems which have immediate
application in the real world. Indeed the benchmarks were based on an industrial
application to airline scheduling.

The significance of Benders Decomposition in comparison with other mas- 
ter/subproblem forms of hybridisation (such as row and column generation) is 
that it takes advantage of linear duality theory. The Benders Decomposition 
library in ECLiPSe harnesses the power of the dual problem for constraint pro- 
grammers who may not find the formulation and application of the linear dual 
either easy or natural. 

Moreover the implementation of Benders Decomposition in ECLiPSe has 
been proven both efficient and scalable. Indeed its results on the minimal pertur- 
bation benchmark problems compare reasonably well even against an algorithm 
specially developed for problems of this class. However the Benders Decompo- 
sition for minimal perturbation problems comprises a master problem and a 
single trivial subproblem. Our experience with this technique has shown that 
this hybridisation form is more suitable to applications where the decomposition 
introduces many or complex subproblems. 

This paper was initially motivated by a network application where Benders 
Decomposition has proven to be the best hybridisation form after considerable 
experimentation with other algorithms. We plan to report on the application of 
this technique to a problem brought to us by an industrial partner in a forth- 
coming paper. 

There remains further work to support fine control over the iteration between
the master and subproblems in Benders Decomposition. The importance of such
fine control is clearly evident from our ECLiPSe implementation of another
hybridisation form, column generation, applied to mixed integer problems. In
particular we will seek to implement early stopping, and more control over the
number of Benders cuts returned at an iteration.


Abstract. We present Branch-and-Check, a hybrid framework integrating Mixed
Integer Programming and Constraint Logic Programming, which encapsulates 
the traditional Benders Decomposition and Branch-and-Bound as special cases. 
In particular we describe its relation to Benders and the use of nogoods and linear 
relaxations. We give two examples of how problems can be modelled and solved 
using Branch-and-Check and present computational results demonstrating more 
than order-of-magnitude speedup compared to previous approaches. We also men- 
tion important future research issues such as hierarchical, dynamic and adjustable 
linear relaxations. 



1 Introduction 

The first goal of this paper is to propose a modeller/solver framework, Branch-and- 
Check, that not only encompasses both the traditional Benders Decomposition and 
Branch-and-Bound schemes of Mixed Integer Programming (MIP) as special cases of 
a spectrum of solution methods, but also adds an extra dimension by allowing the inte- 
gration of Constraint Logic Programming (CLP) in a MIP style branching search. 

In this framework we model a problem in a mixture of CLP and MIP. The CLP part of
the model then adds a relaxation of itself to the MIP part (or the relaxation is
added explicitly). If the two parts do not use the same variables then the model
should include mapping relations between them (shadowed variables). The solution
method is then a branching search, solving the LP relaxation of the MIP part at
every node and branching on the discrete variables, but only solving the CLP part
at the nodes of the branching tree where it is advantageous (or necessary). This
choice can be based, e.g., on how difficult or large the MIP part is compared to
the CLP part, on how easy it is to strengthen the MIP part using the CLP solution,
and on the quality of the resulting cuts.

The second goal of this paper is to identify one of the key elements for the
integration of CLP and MIP that has still not been adequately addressed and to
propose it as a pertinent and pressing research topic in the area of integration:
dynamic linear relaxations of global constraints. We will present computational
results that indicate that for efficient communication between the different parts
of a hybrid model some double modelling is required, i.e., the same constraint or
parts of the model must be present in both CLP and MIP form. It is also vital that
the different forms of the same constraint communicate (intra-constraint
communication). This is what we have previously termed mixed propagation of mixed
CLP-MIP global constraints [20,21,24].

T. Walsh (Ed.): CP 2001, LNCS 2239, pp. 16-30, 2001.

This holds regardless of the scheme used, be it (Tight) Cooperation, Mixed
Logical/Linear Programming (MLLP) [15,16,20,21,24], Branch-and-Check (see Sec. 3)
or some other integration approach [3,6,7,18,23]. This double modelling could be
explicit, but preferably it should be implicit, i.e., mixed global constraints
should post and dynamically update a linear relaxation of themselves, in addition
to the classical CLP propagation on their discrete parts and mixed propagation
between the discrete and continuous parts.

This extends the idea proposed by Beringer and De Backer in [2]. They argued
that the standard CLP architecture is not optimal because cooperation between different 
solvers is only done by value propagation. In addition, they proposed, solvers, e.g., CLP 
and MIP solvers, should be able to communicate by exchanging information through 
variable bounds. Variable bounds are only a special type of linear constraints, so linear 
relaxations take this idea a step further and open up many more possibilities in CLP-MIP 
integration. 

We will exemplify this based on our experiences when developing the
Branch-and-Check framework and also based on our previous line of research
[15,16,20,21,24].

The paper is organised as follows. This section outlined the focus of this research.
Section 2 reviews the history of efforts in integrating CLP and MIP along with two
classical MIP techniques, Benders Decomposition and Branch-and-Bound. In Sec. 3
we introduce the Branch-and-Check framework, discuss how it generalises Benders
and Branch-and-Bound, and show how CLP can be integrated. Section 4 then gives
two examples, Scheduling with Dissimilar Parallel Machines and Capacitated Vehicle
Routing with Time Windows, and presents computational results demonstrating more
than order-of-magnitude speedup compared to previous approaches.



2 Background 

2.1 Classical MIP Techniques 

Branch-and-Bound: We will assume that the reader is familiar with the classical 
Branch-and-Bound approach for solving MIPs. Due to different vocabulary in the two 
fields, however, we would like to note that this is the technique that is sometimes referred 
to as Branch-and-Relax 0. 



Benders Decomposition: Classical Benders Decomposition exploits the fact that in
some problems, fixing the values of certain difficult variables simplifies the
problem tremendously. By enumerating those difficult variables, solving each
resulting subproblem and selecting the best subproblem solution found, the
original problem can be solved. Benders’ method [1] is more ingenious. It solves a
master problem to assign values to the difficult variables. Each solution to the
subproblem then generates a Benders cut that is added to the master problem before
re-solving it. Thus each solution to the master problem must satisfy all the
Benders cuts obtained so far, which avoids searching similar regions of the
solution space again. This is similar to the role nogoods play in CLP.

Classical Benders Decomposition applies if the problem can be written

$$
\begin{aligned}
\min\ & cx + fy \\
\text{s.t.}\ & Ax + Gy \ge b, \\
& x \in D,\ y \in \mathbb{R}^n_+ .
\end{aligned}
\tag{1}
$$

If $x^*$ denotes the solution to the master problem then the subproblem is an LP,

$$
\begin{aligned}
\min\ & cx^* + fy \\
\text{s.t.}\ & Gy \ge b - Ax^*, \\
& y \in \mathbb{R}^n_+ ,
\end{aligned}
\tag{2}
$$

which is easily solved. The procedure is iterative, interleaving solving the master
problem to optimality and the resulting subproblem. By applying duality theory to
the solution of the subproblem, cuts can be generated that are added to the master
problem before it is re-solved in the next iteration:

\begin{align}
\min\ & z \notag\\
\text{s.t.}\ & z \ge u^q (b - Ax) + cx, && q \in Q_1, \tag{3}\\
& u^q (b - Ax) \le 0, && q \in Q_2, \tag{4}\\
& x \in D, \notag
\end{align}

where $u^q$ is the dual solution to the subproblem in iteration $q$; the subproblem
is feasible in iterations $q \in Q_1$ and infeasible in iterations $q \in Q_2$. A
more detailed description of Benders Decomposition can be found in [1,8,11,14,17].
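The iteration between the master problem and the subproblem can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the paper: a toy instance with a single integer variable x in a finite set D and a single continuous variable y >= 0, so that subproblem (2) and its dual can be solved in closed form, and a master problem that is solved by plain enumeration. The function names are hypothetical.

```python
from math import inf


def solve_subproblem(x, a, g, b, f):
    """Solve min f*y s.t. g[i]*y >= b[i] - a[i]*x, y >= 0 (scalar y, f > 0, g[i] >= 0).

    Returns (feasible, value, u). If feasible, u is an optimal dual solution;
    otherwise u is a dual ray certifying infeasibility.
    """
    m = len(b)
    rhs = [b[i] - a[i] * x for i in range(m)]
    # A row with g[i] == 0 does not involve y; if it is violated no y can repair it.
    for i in range(m):
        if g[i] == 0 and rhs[i] > 0:
            ray = [0.0] * m
            ray[i] = 1.0
            return False, None, ray
    # Otherwise y* is the largest lower bound; its row carries all the dual weight.
    y, tight = 0.0, None
    for i in range(m):
        if g[i] > 0 and rhs[i] / g[i] > y:
            y, tight = rhs[i] / g[i], i
    u = [0.0] * m
    if tight is not None:
        u[tight] = f / g[tight]
    return True, f * y, u


def benders(D, c, f, a, g, b, max_iter=100, eps=1e-9):
    """Classical Benders loop for min c*x + f*y s.t. a[i]*x + g[i]*y >= b[i]."""
    m = len(b)
    opt_cuts, feas_cuts = [], []  # dual vectors u^q for q in Q1 and q in Q2

    def dual_value(u, x):  # u (b - A x)
        return sum(u[i] * (b[i] - a[i] * x) for i in range(m))

    for iteration in range(1, max_iter + 1):
        # Master problem: min z subject to cuts (3) and (4), x in D (enumeration).
        best_x, lower = None, inf
        for x in D:
            if any(dual_value(u, x) > eps for u in feas_cuts):
                continue  # x violates a feasibility cut (4)
            z = max((dual_value(u, x) + c * x for u in opt_cuts), default=-inf)
            if best_x is None or z < lower:
                best_x, lower = x, z
        if best_x is None:
            return None  # the master problem is infeasible
        feasible, value, u = solve_subproblem(best_x, a, g, b, f)
        if not feasible:
            feas_cuts.append(u)  # add a cut of type (4)
            continue
        upper = c * best_x + value
        if upper - lower <= eps:
            return best_x, upper, iteration  # bounds meet: optimal
        opt_cuts.append(u)  # add a cut of type (3)
    raise RuntimeError("no convergence within max_iter")


# Toy instance: min 2x + 3y  s.t.  x + y >= 4,  x >= 1,  x in {0,...,5},  y >= 0.
result = benders(range(6), c=2, f=3, a=[1, 1], g=[1, 0], b=[4, 1])
print(result)
```

On this toy instance the first master solution violates the row that involves only x, which yields a cut of type (4); the later iterations add cuts of type (3) until the lower bound z meets the cost of the current solution.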

2.2 Previous Integration Schemes 

Properties of a number of different problems were considered by Darby-Dowman and 
Little in ma and their effect on the performance of CLP and MIP approaches were 
presented. They reported experimental results that illustrate some key properties of the 
techniques: MIP is very efficient for problems with good relaxations, but it suffers 
when the relaxation is weak or when its restricted modelling framework results in large 
models. CLP, with its more expressive constraints, has smaller models that are closer to 
the problem description and behaves well for highly constrained problems, but it lacks 
the “global perspective” of relaxations. 

(Tight) Cooperation: Beringer and De Backer proposed in [2] that CLP and MIP solvers
can be coupled together with common (or shadowed) variables in a double modelling 
framework using two way communication: The MIP solver sends the CLP solver the 
values of the common variables that are fixed, which relies on the MIP solver being able 
to detect implied inequalities, and the CLP solver sends the MIP solver strengthened 
bounds for the common variables. They compared solving a multi-knapsack problem as 
a pure CLP or a pure MIP against using the cooperation of the two solvers, and obtained 
favourable results. 

Refalo proposed an extension to this framework in [22], where the MIP model is
dynamic; it is restated when variable bounds are tightened or variables are fixed
by the CLP solver.

Mixed Logical/Linear Programming (MLLP): Hooker et al. proposed a new modelling
paradigm to efficiently integrate CLP and MIP in [12,13,15,16]. In that framework,
constraints are in the form of conditionals that link the discrete and continuous
elements of the problem. An MLLP model has the form

$$
\begin{aligned}
\min\ & cx \\
\text{s.t.}\ & h_i(y) \rightarrow A^i x \ge b^i, \quad i \in I, \\
& y \in D,\ x \in \mathbb{R}^n .
\end{aligned}
\tag{5}
$$

The antecedents $h_i(y)$ of the conditionals are constraints that can be treated
with CLP techniques. The consequents are linear inequality systems that form an LP
relaxation.

An MLLP problem is solved by branching on the discrete variables. The conditionals
assign roles to CLP and LP: CLP is applied to the discrete constraints to reduce the
search and help determine when partial assignments satisfy the antecedents. At each
node of the branching tree an LP solver minimises $cx$ subject to the inequalities
$A^i x \ge b^i$ for which $h_i(y)$ is determined to be true. This delayed posting of
inequalities leads to small and lean LP problems that can be solved efficiently.

Ottosson, Thorsteinsson and Hooker, and Ottosson and Thorsteinsson extended
MLLP in [21,24] by proposing to add mixed global constraints that have both discrete
and continuous elements within them. A mixed global constraint has a dynamically
stated linear relaxation that becomes a part of the continuous part and propagates
information between the discrete and continuous parts of the model. In that
framework the mixed global constraints serve both as a modelling tool and as a way
to exploit structure in the solution process. Mixed global constraints can be
written in the form (5) as conditionals, analogous to global constraints in CLP, but
improve the solution process by improving the propagation.



Hybrid Decomposition: Jain and Grossmann, and Harjunkoski, Jain and Grossmann 
presented a scheme in [9,18] where the problem is decomposed into two sub-parts, one
handled by MIP and the other by CLP. This is demonstrated using a multi-machine 
scheduling problem where the assignment of tasks to machines is modelled as a MIP 
and the sequencing of the tasks on the assigned machines is handled using CLP. The 
search scheme is an iterative procedure where the assignment problem is first solved to 
optimality, identifying which machine to use for each task, and then a CLP feasibility 
problem is solved trying to sequence according to this assignment. If the sequencing fails, 
cutting planes are added to the MIP problem to forbid this (and subsumed) assignments 
and the process is iterated. This approach has many similarities to Benders and in fact, 
in (H it is shown how this problem can be written for Benders. 



Other Approaches: Bockmayr and Kasper proposed an interesting framework in Q 
for combining CLP and MIP, in which several approaches to integration or synergy are 
possible, by dividing the constraints for both CLP and MIP into two different categories, 
primitive and non-primitive. Primitive constraints are those for which there exists a 
polynomial time solution algorithm and non-primitive constraints are those for which 
this is not true.

Rodosek et al. presented in [23] a systematic approach for transforming a CLP model
into a corresponding MIP model. CLP is then used along with linear relaxations in a
single search tree to prune domains and establish bounds. The downside of this
approach is that the systematic procedure that creates the shadow MIP model for the
original CLP model includes reified arithmetic constraints, i.e., big-M constraints.
A translation involving numerous big-M constraints may result in a poor MIP model,
i.e., one with a poor linear relaxation.
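To see why big-M constraints tend to weaken the relaxation, consider a generic reified constraint. This is an illustrative example of ours and is not taken from [23]; the symbols $\delta$, $k$, $U$ and $M$ are assumptions. Let $0 \le x \le U$ and let a binary variable $\delta$ enforce $x \le k$ whenever $\delta = 1$. A standard big-M encoding with $M = U - k$ is

$$
x \le k + M(1 - \delta), \qquad \delta \in \{0,1\}.
$$

In the LP relaxation $\delta$ ranges over $[0,1]$, so the constraint only requires

$$
x - k \le M(1 - \delta).
$$

For large $M$, a value of $\delta$ only slightly below 1 already permits a large violation of $x \le k$. The LP solution can therefore keep $\delta$ fractional at almost no cost, and the bound obtained from the relaxation carries little information about the logical condition.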

3 Branch-and-Check 

3.1 Description of the General Method 

Branch-and-Check builds to a certain extent on Benders Decomposition. The basic idea 
is to identify a part of the problem that is basic and a part that is delayed. The solution 
process is a branching search on the basic part where the delayed part is checked (e.g., for 
feasibility) as late or as seldom as possible. The rationale is that while the delayed part is 
necessary to check for the correctness of the solution, it may be large and computationally 
expensive to include in every step of the calculations and thus we want to delay looking 
at it as long as possible. We are going to refer to the basic part as the master problem 
and the delayed part as the subproblem. 

This strategy can be applied to a problem of this general form:

\begin{align}
\min\ & cx + f(y) \tag{6}\\
\text{s.t.}\ & Ax \le b, \tag{7}\\
& H(x,y), \tag{8}
\end{align}

where the problem is naturally split into a mixed integer linear part (7) and a
non-linear part (8), e.g., (mixed) global constraints such as the piecewise-linear
and alldifferent constraints. The constraints of the master problem are in the top
part and the constraints of the subproblem are in the lower part. The non-linear
part can also include linear constraints or mappings between the x and y variables.
Thus the following would be examples of problem forms this strategy can be applied
to:

$$
\begin{array}{lll}
\min\ cx + dy & \min\ cx + f(y) & \min\ cx + f(y) \\
\text{s.t.}\ A^1 x \le b^1, & \text{s.t.}\ Ax \le b, & \text{s.t.}\ Ax \le b, \\
\phantom{\text{s.t.}\ } A^2 y \le b^2, & \phantom{\text{s.t.}\ } H(x,y), & \phantom{\text{s.t.}\ } x \leftrightarrow y, \\
 & & \phantom{\text{s.t.}\ } F(y).
\end{array}
$$

In the third form, $x \leftrightarrow y$ represents that there is a mapping between
the values of the variables x and y, e.g., a one-to-one mapping between two
variables or between a variable and a set of variables such as
$y \in \{1, \ldots, n\}$, $x_1, \ldots, x_n \in \{0,1\}$ and $x_y = 1$. This second
mapping is a common mapping between CLP and MIP.

Since the master problem is a relaxation of the original problem, when a solution
to the master problem is found in the branching search it is not guaranteed that the
solution is truly feasible nor that the objective value is correct. At those nodes
in the branching tree, i.e., where all the variables in the master problem have been
instantiated and the branching search is about to fathom the subtree, we solve the
subproblem as well to determine if the overall solution is feasible and then what
its correct objective function value is. We can solve the subproblem more often, but
how often to consult the subproblem is a matter of how large or computationally
expensive the subproblem is compared to the master problem.

Completely ignoring the subproblem for most of the solution process, only solving
it at selected nodes, is not going to work, however, so we augment the master
problem with a relaxation of the subproblem: a simpler and computationally less
expensive representation of the subproblem that focuses the master problem on good
candidate solutions with respect to the subproblem. For example, for the third form,
the master problem would become

$$
\begin{aligned}
\min\ & cx + \mathcal{L}_{f(y)}(x) \\
\text{s.t.}\ & Ax \le b, \\
& \mathcal{L}_{F(y)}(x).
\end{aligned}
$$

The relaxation should be hierarchical if possible, e.g., if the subproblem is a CLP
then the whole relaxation should preferably be the union of the relaxations of the
individual global constraints that comprise the subproblem. It should also be
dynamic, i.e., as the solution process progresses it should be updated, e.g., when
variables are fixed; and adjustable, i.e., it should be possible to efficiently make
incremental changes, rather than have to recompute it at every node.

Whenever the subproblem is solved, cuts are added to the master problem. We add a
lower bounding cut if the subproblem is feasible, bounding the objective function
from below, or an infeasibility cut (a nogood) if the subproblem is infeasible,
disallowing this solution and others similar to it. For example, for the third form,
the master problem would become

$$
\begin{aligned}
\min\ & cx + z \\
\text{s.t.}\ & Ax \le b, \\
& z \ge L(x), \\
& N(x).
\end{aligned}
$$

The subproblem in this case will be

$$
\begin{aligned}
\min\ & f(y) \\
\text{s.t.}\ & F(y),
\end{aligned}
$$

given the mapping $x \leftrightarrow y$ between the variables in the master and
subproblems, i.e., some of the variables in the subproblem may be fixed or have
restricted values based on the current solution of the master problem. In the
examples we will look at in Sec. 4 the solution to the master problem will determine
how the subproblem decomposes.

The master problem for the general form (6)-(8) is

\begin{align}
\min\ & cx + z \tag{9}\\
\text{s.t.}\ & Ax \le b, \tag{10}\\
& \ldots \tag{11}\\
& \ldots \tag{12}\\
& z \ge L(x), \tag{13}\\
& N(x), \tag{14}
\end{align}

and the corresponding subproblem is

\begin{align}
\min\ & cx^* + f(y) \tag{15}\\
\text{s.t.}\ & H(x^*,y), \tag{16}
\end{align}

where $x^*$ is the solution to the master problem.
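The interplay between the basic part, the delayed part and the nogoods can be sketched in Python as follows. This is only an illustration under simplifying assumptions that are not from the paper: all master variables are binary, the costs are non-negative, f is identically zero (so the subproblem is a pure feasibility check), and a nogood is stored as a partial assignment that must not be extended. The names `branch_and_check`, `master_ok` and `check` are hypothetical.

```python
def branch_and_check(costs, check, master_ok):
    """Depth-first search on binary master variables.

    master_ok(partial) tests the basic (linear) part on a partial assignment;
    check(x) is the delayed part, consulted only at leaves. It returns
    (True, None) or (False, nogood), where nogood maps indices to values.
    """
    n = len(costs)
    nogoods = []
    best = {"cost": float("inf"), "x": None}
    stats = {"checks": 0}

    def violates_nogood(partial):
        k = len(partial)
        return any(
            all(i < k and partial[i] == v for i, v in ng.items())
            for ng in nogoods
        )

    def search(partial, cost):
        if cost >= best["cost"]:          # bound: costs are non-negative
            return
        if violates_nogood(partial) or not master_ok(partial):
            return
        if len(partial) == n:             # leaf: consult the delayed part
            stats["checks"] += 1
            feasible, nogood = check(partial)
            if feasible:
                best["cost"], best["x"] = cost, list(partial)
            else:
                nogoods.append(nogood)
            return
        i = len(partial)
        for v in (0, 1):
            partial.append(v)
            search(partial, cost + costs[i] * v)
            partial.pop()

    search([], 0)
    return best["x"], best["cost"], stats["checks"]


# Toy instance: choose items of minimum cost whose weights sum to at least
# `demand` (basic part) and are pairwise different (delayed part).
weights = [2, 2, 3, 1]
costs = [1, 1, 4, 1]
demand = 4


def master_ok(partial):
    # Optimistic test of sum(w * x) >= demand on a partial assignment.
    chosen = sum(w for w, v in zip(weights, partial) if v)
    remaining = sum(weights[len(partial):])
    return chosen + remaining >= demand


def check(x):
    # Delayed alldifferent check; a conflict yields the nogood "not both".
    seen = {}
    for i, v in enumerate(x):
        if v:
            if weights[i] in seen:
                return False, {seen[weights[i]]: 1, i: 1}
            seen[weights[i]] = i
    return True, None


solution = branch_and_check(costs, check, master_ok)
print(solution)
```

In this sketch the delayed part is consulted only at leaves that survive the bound, the stored nogoods and the basic part. A nogood learned at one leaf then prunes later nodes that repeat the same conflicting choice, without a further call to `check`.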



3.2 Special Cases 

Benders Decomposition: The correspondence to the Branch-and-Check framework is
that a problem solved using classical Benders has an empty basic part (see (10)) and
no relaxation of the subproblem (see (11)-(12)). It only has general (i.e.,
non-problem specific) lower bounding cuts (3) (see (13)) and nogoods (4) (see (14))
that are derived using LP duality theory.

Branch-and-Bound: Classical Branch-and-Bound,

$$
\begin{aligned}
\min\ & cx \\
\text{s.t.}\ & Ax \le b, \\
& x \in \mathbb{R}^n,\ \text{some } x_i \in \mathbb{Z},
\end{aligned}
\tag{17}
$$

is at the other extreme: it has an empty delayed part, and hence no relaxation of
the subproblem, no lower bounding cuts and no nogoods, or only the trivial nogoods
that are implicit in the branching and fathoming scheme (see (11)-(14)). It only has
a basic part (see (10)).

3.3 Integrating MIP and CLP 

It is immediately obvious that a spectrum of techniques exists between classical
Benders Decomposition and Branch-and-Bound. In particular:

- In Benders the solution process might be accelerated by adding some cuts or valid
  inequalities a priori, i.e., adding a linear relaxation of the subproblem
  (11)-(12), instead of starting with an empty master problem and waiting for the
  Benders cuts to accumulate and start guiding the process (in the master problem)
  to promising candidate solutions.
- Instead of looking at the entire problem at every node of the Branch-and-Bound
  search tree, a part of the set of variables/constraints can be delayed and only
  examined when the need arises. This will result in smaller problems being solved
  at each node, which, although more nodes may be needed, may still result in
  overall savings.

We note, however, that in addition to this merger of Benders and Branch-and-Bound,
the Branch-and-Check framework also allows for an additional dimension of
flexibility. The subproblem can be of almost any form, in particular MIP and CLP can
be integrated by using CLP to model and solve the subproblems. The MIP search in the
master problem is still guided by the subproblem, via the relaxation (11)-(12) and
the lower bounds and nogoods (13)-(14).

It is true that if the subproblem is not an LP, or more accurately if duality theory is not 
available, more work has to be put into deriving the lower bounds and nogoods. A survey 
of different duality concepts for a variety of problem classes can be found in [14]. It
is not uncommon in CLP and MIP, however, to have to tailor methods for specific 
structures. For example, global constraints in CLP require that propagation algorithms 
be designed for each one and in MIP, problem specific cutting planes are widely used. 
In a similar fashion, when integrating CLP and MIP, work has to be put into deriving 
linear relaxations of mixed global constraints. 

3.4 Relation to Previous Work on Decomposition Methods and Nogoods 

The first key idea for extensions to the classical Benders framework was due to Jeroslow 
and Wang [19]. They envisioned the dual of a problem (in the case of classical Benders, an
LP) as an inference problem, by showing that when LP demonstrates the unsatisfiability 
of a set of Horn clauses in propositional logic, the dual solution contains information 
about a unit resolution proof of unsatisfiability. 

Hooker defined the general inference dual in , which was then used by Hooker and 
Yan in [17] for a logic-based Benders scheme in the context of logic circuit verification.
There are many similarities between that paper and the paper of Jain and Grossmann [18],
except that Hooker and Yan used a specialised inference algorithm rather than a general 
CLP package for the subproblem, and the problem was logic circuit verification rather 
than machine scheduling. 

Benders Decomposition for Branching, generating Benders cuts from an LP sub- 
problem while in the process of solving the master problem, was described by Hooker 
in HD. This is the essence of Branch-and-Check, in the context of classical Benders; 
the examples there do not solve the subproblem with CLP. We go a step further by using 
a CLP solver to get the cuts for Branch-and-Check, and in addition, we give the first 
computational results for Branch-and-Check in a Benders context. Branch-and-Check 
as defined here is a form of Generalised Benders (it partitions the variables and only 
uses some of them in the master problem, which is the core of Benders) that generates 
cuts in the process of solving the master problem once. 

The idea of using nogoods in branching is a standard AI technique.
Branch-and-Check is different in that only a relaxation of the problem, rather than
the full problem, is solved at each node. The full problem is consulted at only a
few nodes, and nogoods generated accordingly. In classical AI, the full problem
would generally be checked at every node. The optimisation community has apparently
never used nogoods in branching search and the constraint satisfaction community has
apparently never used generalised Benders as a means to generate nogoods, although
Beringer and De Backer have done related work. The integration of Benders and CLP
could give new life to the idea of a nogood, which has received limited attention in
practical optimisation algorithms.



4 Examples 

In this section, we examine two problems, Scheduling with Dissimilar Parallel
Machines (SDPM) and Capacitated Vehicle Routing with Time Windows (CVRTW),
that benefit from using CLP to model and solve the subproblems, and demonstrate some
of the issues that arise.



4.1 Scheduling with Dissimilar Parallel Machines 

This problem and a decompositional method to solve it were first presented by Jain
and Grossmann in [18]. The problem is described as follows: The least cost schedule
has to be derived for processing a set of orders with release and due dates using a
set of dissimilar parallel machines. The machines are dissimilar in the sense that
there is a different cost and processing time associated with each order-machine
pair, but all the machines perform the same job. Jain and Grossmann modelled the
problem thus:

\begin{align}
\min\ & \sum_{i \in I} \sum_{m \in M} C_{im} x_{im} \tag{18}\\
\text{s.t.}\ & ts_i \ge r_i, \quad ts_i \le d_i - \sum_{m \in M} p_{im} x_{im}, && \forall i \in I, \tag{19}\\
& \sum_{m \in M} x_{im} = 1, && \forall i \in I, \tag{20}\\
& \sum_{i \in I} p_{im} x_{im} \le \max_i \{d_i\} - \min_i \{r_i\}, && \forall m \in M, \tag{21}\\
& \text{if } (x_{im} = 1) \text{ then } (z_i = m), && \forall i \in I,\ \forall m \in M, \tag{22}\\
& i.\mathit{start} \le d_i - p_{z_i}, \quad i.\mathit{start} \ge r_i, && \forall i \in I, \tag{23}\\
& i.\mathit{duration} = p_{z_i}, && \forall i \in I, \tag{24}\\
& i \ \text{requires}\ t_{z_i}, && \forall i \in I. \tag{25}
\end{align}



They also presented a decompositional method that solves this class of MIP problems,
i.e., in which only a subset of the variables appears in the objective function. The
problem decomposes into an optimisation problem (18)-(21) that is suitable for MIP
(it has all of the variables of the objective function and a tight relaxation), and
into a feasibility problem (23)-(25) that can be solved efficiently using CLP. The
variables of the two parts are linked using the mapping (22). The constraints (21)
are not necessary for the correctness of the problem, but are valid inequalities for
the overall problem that are added to the MIP part.

Table 1. Results for 5 x 23 problems using Jain & Grossmann’s approach.

Problem   Model Size      Find and Prove Opt. Solution
Number    Mach.   Jobs    Iter.   Nogoods   MIP sec   CLP sec
     1        5     23       33        71     42.10      0.54
     2        5     23       16        15      0.93      0.37
     3        5     23       33        76      9.15      0.47
     4        5     23       43       104     14.05      0.60
     5        5     23       57        72     13.07      1.01

Table 2. Results for 5 x 23 problems using Branch-and-Check.

Problem   Model Size      Find Opt. Solution                 Prove Opt. Thereafter
Number    Mach.   Jobs    Iter.   Nog.   MIP sec   CLP sec   Iter.   Nog.   MIP sec   CLP sec
     1        5     23        8     20      2.99      0.07       7     18      6.62      0.12
     2        5     23        3      2      0.09      0.07       0      0      0.00      0.00
     3        5     23       19     51      3.78      0.20       0      0      0.00      0.00
     4        5     23       19     55      4.05      0.19       0      0      0.00      0.00
     5        5     23       17     25      1.79      0.21       6      8      1.12      0.14


The solution process then alternates between solving the optimisation problem to 
optimality and the resulting feasibility problems. If all the feasibility problems are fea- 
sible then the solution is optimal, if not then cuts are added to the optimisation problem 
to exclude that solution and others similar to it. 
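One round of this alternation, seen from the feasibility side, can be sketched in Python as follows. This is an illustrative stand-in, not the implementation of [18]: the sequencing problem of each machine is solved by brute-force enumeration of permutations instead of a CLP solver, and the function names and the toy data are assumptions.

```python
from itertools import permutations


def relaxation_holds(assign, release, due, p, machines):
    """Check the valid inequality (21) for a complete assignment.

    assign[i] is the machine of order i and p[i][m] its processing time there.
    """
    horizon = max(due) - min(release)
    return all(
        sum(p[i][m] for i, mi in enumerate(assign) if mi == m) <= horizon
        for m in machines
    )


def machine_feasible(orders, release, due, proc):
    """True if the orders can be sequenced on one machine within their windows."""
    for seq in permutations(orders):
        t, ok = 0, True
        for i in seq:
            t = max(t, release[i]) + proc[i]   # start as early as possible
            if t > due[i]:
                ok = False
                break
        if ok:
            return True
    return False


def check_assignment(assign, release, due, p):
    """Return one nogood (m, orders) per machine whose sequencing fails.

    A nogood (m, orders) stands for the cut: the sum of x[i][m] over the
    listed orders is at most len(orders) - 1.
    """
    nogoods = []
    for m in sorted(set(assign)):
        orders = [i for i, mi in enumerate(assign) if mi == m]
        proc = {i: p[i][m] for i in orders}
        if not machine_feasible(orders, release, due, proc):
            nogoods.append((m, orders))
    return nogoods


# Toy data: three orders, two machines.
release = [0, 0, 0]
due = [4, 4, 6]
p = [[3, 4], [3, 4], [2, 3]]   # p[i][m]

print(relaxation_holds([0, 0, 0], release, due, p, machines=[0, 1]))
print(check_assignment([0, 0, 1], release, due, p))
print(check_assignment([0, 1, 0], release, due, p))
```

In the toy data, placing all three orders on machine 0 is already rejected by the relaxation (21). Placing orders 0 and 1 together on machine 0 passes the relaxation but fails in sequencing, which yields a nogood for that pair on that machine.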

This approach bears a striking resemblance to Benders Decomposition. In fact,
Hooker showed how this problem can be written for Benders. It was while
studying this result that the idea of Branch-and-Check took form. We note that the
correspondence with Branch-and-Check is that the function $f$, the subproblem part
of the objective function, is identically zero (see (6)), there are no lower
bounding cuts (see (13)), there is a simple relaxation of the subproblem (21) in the
master problem (see (11)-(12)), and the problem is solved using multiple search
trees by adding nogoods (see (14)) of the form:



$$
\sum_{i \in I} x^*_{im} x_{im} \le \sum_{i \in I} x^*_{im} - 1 .
$$

Jain and Grossmann presented very nice computational results in their paper [18],
comparing against pure CLP and MIP approaches. While studying the CVRTW problem
and how Branch-and-Check could be applied to that problem, we wondered what the
source of this method's power was. At first we looked at the nogoods, but it turns
out that the real power of this method lies in the linear relaxation (21). If it is
removed from the formulation, problems that are solved in a matter of seconds with
the relaxation can run for more than 24 hours without making any progress. This
indicates that further research into linear relaxations of the global CLP
constraints, i.e., mixed global constraints [21,24], is very important.

A further study of the results also revealed a significant difference in the time it
took to solve the MIPs vs. the CLPs: solving the MIPs took up to a factor of 30 longer.
Table 3. Results for 7 x 30 problems using Jain & Grossmann’s approach.

Problem  Model Size     Find and Prove Opt. Solution
Number   Mach.  Jobs    Iter.  Nogoods  MIP sec  CLP sec
1        7      30      36     80       15.15    1.06
2        7      30      96     206      90.66    2.78
3        7      30      115    225      116.87   3.42
4        7      30      71     112      34.94    2.25
5        7      30      58     97       28.25    1.92



Table 4. Results for 7 x 30 problems using Branch-and-Check.

Problem  Model Size     Find Opt. Solution               Prove Opt. Thereafter
Number   Mach.  Jobs    Iter.  Nog.  MIP sec  CLP sec    Iter.  Nog.  MIP sec  CLP sec
1        7      30      10     11    0.83     0.36       0      0     0.00     0.00
2        7      30      32     62    9.92     0.98       0      0     0.00     0.00
3        7      30      8      11    0.73     0.27       0      0     0.00     0.00
4        7      30      16     27    2.55     0.46       0      0     0.00     0.00
5        7      30      8      13    0.94     0.24       0      0     0.00     0.00



This indicated, and the results in our tables confirm, that the master problem
should not necessarily be solved to optimality; instead the CLP subproblems should be
solved regularly throughout the tree. This result is very intuitive: the CLP
subproblem decomposes into one problem for each individual machine, and these are
rather small compared to the larger MIP master problem that considers all the machines
at the same time. We also compared our approach on the original data given by Jain and
Grossmann and obtained very favourable results. Most of those instances are,
however, trivially solved using either method, so we do not include them here.

We implemented the Branch-and-Check approach for this problem using OPL and OPL
Script, as follows. We halted the MIP master problem when a feasible solution was
found and solved the CLP subproblems. If any of them were infeasible, we added nogoods
to the master problem and re-solved. If all were feasible, we recorded that as a new
“current-best-solution”, constrained the objective function of the master problem and
re-solved. This process was iterated until the master problem was infeasible, indicating
that no further solutions could be found given the current bound on the objective function
and the nogoods posted.
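The restart loop just described can be sketched in Python. This is only an illustration under strong assumptions, not the OPL implementation: the MIP master is replaced by brute-force enumeration that stops at the first acceptable assignment, the CLP subproblem by a brute-force single-machine sequencing check, and durations are taken to be machine-independent. All names (`machine_feasible`, `master`, `branch_and_check`) are hypothetical.

```python
from itertools import permutations, product

def machine_feasible(jobs, release, due, dur):
    """Subproblem stand-in: can `jobs` be sequenced on one machine within their windows?"""
    for order in permutations(jobs):
        t, ok = 0, True
        for j in order:
            t = max(t, release[j]) + dur[j]   # start no earlier than the release time
            if t > due[j]:
                ok = False
                break
        if ok:
            return True
    return False

def master(cost, n_jobs, n_mach, bound, nogoods):
    """Master stand-in: halt at the first assignment below `bound` that violates no nogood."""
    for assign in product(range(n_mach), repeat=n_jobs):
        total = sum(cost[assign[j]][j] for j in range(n_jobs))
        if total >= bound:
            continue
        if any(all(assign[j] == m for j in jobs) for m, jobs in nogoods):
            continue
        return assign, total
    return None

def branch_and_check(cost, release, due, dur, n_mach):
    n_jobs = len(dur)
    best, bound, nogoods = None, float("inf"), []
    while True:
        found = master(cost, n_jobs, n_mach, bound, nogoods)
        if found is None:                     # master infeasible: best is optimal
            return best, bound
        assign, total = found
        infeasible = []
        for m in range(n_mach):
            jobs = tuple(j for j in range(n_jobs) if assign[j] == m)
            if not machine_feasible(jobs, release, due, dur):
                infeasible.append((m, jobs))  # these jobs may not all go on m again
        if infeasible:
            nogoods.extend(infeasible)        # add nogoods and re-solve
        else:
            best, bound = assign, total       # new current best: tighten the bound

# Toy instance: machine 0 is cheap but can only fit two of the three jobs.
best, value = branch_and_check(
    cost=[[1, 1, 1], [5, 5, 5]], release=[0, 0, 0], due=[4, 4, 4], dur=[2, 2, 2], n_mach=2
)
```

As in the text, every call to `master` restarts the search from scratch, which is exactly the redundancy a solver with node-level callbacks would avoid.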

This implementation carries significant overhead and redundant calculations: we
restart the master problem after adding cuts, instead of continuing from where we
left off, and thus re-solve many similar nodes of the search tree repeatedly. A better
tool that allowed dynamic modifications of the master problem at each node of the
search tree would obtain substantially better results.

4.2 Capacitated Vehicle Routing with Time Windows 

This problem is one of visiting a set of customers using vehicles stationed at a central
depot, respecting constraints such as the capacity of the trucks, a time window promised
to each customer, precedence constraints on the customers, etc. The goal is to produce a
low cost routing plan, specifying for each vehicle which customers it should visit and
in what order. Cost is generally proportional to the number of vehicles, the maximum
time or the total travel time.

We note that this problem decomposes. Given an assignment of trucks to routes that 
assigns each customer to a specific truck and obeys the capacity constraints, we have 
to sequence each truck by solving a Travelling Salesman Problem with Time Windows 
that satisfies the time window and precedence constraints and minimises our objective 
for each one. 

Using the global cumulative and count constraints and variable index sets we can 
state the problem as follows for Branch-and-Check, minimising the cost of the trucks: 



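\begin{align}
\min \quad & \sum_{i \in T} C_i y_i \tag{26}\\
\text{s.t.} \quad & t_j \ge R_j, \quad t_j + D_j \le S_j, & \forall j \in C, \tag{27}\\
& \sum_{j \mid (z_j = i)} \dots, & \forall i \in T, \tag{28}\\
& (y_i = 0) \Rightarrow \mathrm{count}(i, [z_1, \dots, z_n], =, 0), & \forall i \in T, \tag{29}\\
& \mathrm{cumulative}((j \mid (z_j = i)), t_k, R_k, S_k, D_k, 1, 1), & \forall i \in T. \tag{30}
\end{align}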


Equations (27) are the time windows, (28) are the capacity constraints and (29) ensure
that if a truck is not being used, then no customers are assigned to it. The cumulative
constraint (30) is imposed for each truck and schedules the customers assigned to it.
The parameters are the customers assigned to the truck, the start time variables, time
windows and durations of service, the transition times between all pairs of customers, a
vector of all ones indicating that each customer requires one truck, and finally that there
is one truck available. If z has been fixed to z* then the subproblem for each truck i is:

\begin{align}
& \mathrm{cumulative}((j \mid (z^*_j = i)), t_k, R_k, S_k, D_k, 1, 1), \tag{31}\\
& t_j \ge R_j, \quad t_j + D_j \le S_j, \qquad \forall j \mid (z^*_j = i). \tag{32}
\end{align}

If the subproblem is infeasible then nogoods can be generated to avoid that assignment
and added to the master problem. Call the accumulated set of those nogoods in the ℓ-th
iteration N_ℓ(x_ij). Then we can write the master problem as a MIP:



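\begin{align}
\min \quad & \sum_{i \in T} C_i y_i \tag{33}\\
\text{s.t.} \quad & t_j \ge R_j, \quad t_j + D_j \le S_j, & \forall j \in C, \tag{34}\\
& \dots & \forall i \in T,
\end{align}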
We note that the 0-1 variables x_ij correspond to the general integer variables
z_j and the index sets {j | (z_j = i)}.
We add a dynamic relaxation of the subproblem to the master problem by approximating
the total travel time as follows: A truck will have to travel to each customer from
somewhere. Thus if for each customer we find the nearest neighbour, the sum of those
distances and the service times for the customers assigned to a truck is a lower bound
on the actual travel time. While solving the master problem some customers will be
assigned to a particular truck, through the branching. When that happens we can update
the lower bound, noting that the nearest neighbour cannot be among those that have
been assigned to other trucks. For truck i, let A_i be the set of customers that have been
assigned to truck i and let A_0 be the set of unassigned customers. For each truck i ∈ T
add




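\[
\sum_{j \in C} \Bigl( \min_{q \in (A_0 \cup A_i) \setminus \{j\}} \operatorname{dist}(j, q) + D_j \Bigr) x_{ij}
\;\le\; \max_{q \in A_0 \cup A_i} S_q \;-\; \min_{q \in A_0 \cup A_i} R_q
\]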



to the master problem. The sets A_i and the relaxation can be updated based on which
x_ij's have been fixed to 1:

Set propagation: All customers start in A_0. When x_pq is fixed to 1, customer q
moves from A_0 to A_p and x_kq, k ≠ p, can be fixed to 0.

Relaxation propagation: Before solving, we calculate the n x n table of shortest
distances and sort each list, so that for each customer there is a list of length n − 1
of the other customers in increasing distance order. We then build a graph of nearest
neighbours. Each node has one outgoing arc, to its nearest neighbour, and some incoming
arcs from the nodes that consider it to be their nearest neighbour. The trigger for the
propagation is when customer q moves from A_0 to A_p:

Outgoing arc propagation: Customer q may have to revise its choice of nearest
neighbour q*. If q* is in A_k, k ≠ 0, p, then q must look at its list and find the first
customer after q* that is in A_0 or A_p.

Incoming arc propagation: Node q must notify the nodes that consider it to be their
nearest neighbour. Every such node in A_k, k ≠ 0, p, must perform outgoing arc
propagation, revising its choice of nearest neighbour by looking for the first
customer on its list after q that is in A_0 or A_k.
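The set and relaxation propagation above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the class name `NearestNeighbours` and its methods are hypothetical, truck index 0 stands for the unassigned set A_0, there is no backtracking, and incoming arcs are found by scanning all customers rather than by maintaining explicit arc lists.

```python
class NearestNeighbours:
    def __init__(self, dist):
        n = len(dist)
        self.dist = dist
        # For each customer, the other customers in increasing distance order.
        self.order = [
            sorted((k for k in range(n) if k != j), key=lambda k, j=j: dist[j][k])
            for j in range(n)
        ]
        self.truck = [0] * n   # 0 means "unassigned" (set A_0); trucks are 1, 2, ...
        self.pos = [0] * n     # position of the current nearest neighbour in order[j]

    def nearest(self, j):
        """Current nearest neighbour of j, or None if no admissible customer is left."""
        return self.order[j][self.pos[j]] if self.pos[j] < len(self.order[j]) else None

    def _allowed(self, j, k):
        # k is admissible for j unless both are assigned, and to different trucks.
        return self.truck[j] == 0 or self.truck[k] in (0, self.truck[j])

    def _advance(self, j):
        """Outgoing arc propagation: move to the first admissible customer on j's list."""
        while self.pos[j] < len(self.order[j]) and not self._allowed(j, self.order[j][self.pos[j]]):
            self.pos[j] += 1

    def assign(self, q, p):
        """Customer q moves from A_0 to A_p; revise the affected nearest neighbours."""
        self.truck[q] = p
        self._advance(q)
        for j in range(len(self.truck)):   # incoming arc propagation
            if j != q and self.truck[j] not in (0, p) and self.nearest(j) == q:
                self._advance(j)

    def bound(self, p, service):
        """Lower bound on the travel time of truck p over its assigned customers."""
        total = 0
        for j, t in enumerate(self.truck):
            if t == p:
                k = self.nearest(j)
                total += service[j] + (self.dist[j][k] if k is not None else 0)
        return total

# Four customers on a line at positions 0, 1, 3 and 10.
points = [0, 1, 3, 10]
dist = [[abs(a - b) for b in points] for a in points]
nn = NearestNeighbours(dist)
nn.assign(1, 1)   # customer 1 goes to truck 1; its nearest neighbour is still customer 0
nn.assign(0, 2)   # customer 0 goes to truck 2, so customer 1 must look further
```

Because the position in each sorted list only moves forward, the total work over one descent of the search tree is bounded by the size of the distance table; restoring positions on backtracking is left out of this sketch.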



In addition we can add various other valid inequalities to the master problem, such as
symmetry breaking constraints if the trucks are identical (i.e., same cost and capacity).
We can require that the first stop assigned to truck i be less than or equal to the first
stop assigned to truck i + 1. This can be stated in inequality form as



E Xij, Vto, n with n <m, Vi G T. 
i=i 

We can also order the trucks by adding constraints of the form yi < (if the number 
of trucks is variable). 



5 Conclusion 

CLP and MIP are approaches whose integration has the potential to benefit the
solution of combinatorial optimisation problems. In this paper we proposed a
modeller/solver framework, Branch-and-Check, that encompasses both the traditional
Benders and Branch-and-Bound schemes of MIP as special cases of a spectrum of solution
methods and adds an extra dimension by allowing the integration of CLP in a MIP style
branching search. In particular we have described the relationship between
Branch-and-Check and Benders.

We have presented the intuition behind Branch-and-Check, namely to delay parts of the
problem, and verified it with computational experiments. We have also addressed one
of the key elements for the integration of CLP and MIP: dynamic linear relaxations
of global constraints. The computational results indicate that efficient communication
between the different parts of a hybrid model requires some double modelling, i.e., the
same constraint must be present in both CLP and MIP form. Preferably this double
modelling should be implicit, i.e., mixed global constraints should post and dynamically
update a linear relaxation of themselves. This relaxation should be adjustable, i.e., it
should be possible to efficiently make incremental changes, rather than recompute it at
every node.

Indirectly, we have also touched on the availability of flexible tools for testing
integration ideas, or the lack thereof. We conclude that there is a pressing need in
this community for a branching solver that is efficient but also highly customisable,
allowing control over how each node of the search tree is processed, solved and
propagated, and how the problem is modified at each node both when branching and
backtracking.
Abstract. A difficulty that arises frequently when writing a constraint
solver is to determine the constraint propagation and simplification
algorithm. In previous work, different methods for automatic generation
of propagation rules and simplification rules [3] for constraints
defined over finite domains have been proposed. In this paper, we present
a method for generating rule-based solvers for constraint predicates
defined by means of a constraint logic program, even when the constraint
domain is infinite. This approach can be seen as a concrete step towards
Inductive Constraint Solving.



1 Introduction 

Inductive Logic Programming (ILP) is a machine learning technique that
emerged at the beginning of the 90's. ILP has been defined as the intersection
of inductive learning and logic programming. It aims at inducing hypotheses
from examples, where the hypothesis language is first order logic restricted
to Horn clauses. To handle numerical knowledge, an inductive framework called
Inductive Constraint Logic Programming (ICLP), similar to that of ILP but
based on constraint logic programming schemes, has been proposed [13]. ICLP
extends ideas and results from ILP to the learning of constraint logic programs.

In this paper, we propose a method to learn rule-based constraint solvers
from the definitions of the constraint predicates. We call this approach
Inductive Constraint Solving (ICS). It extends previous works [5,17,3] where different
methods for automatic generation of propagation rules for constraints defined
over finite domains have been proposed.

In rule-based constraint programming, the solving process of constraints
consists of a repeated application of rules. In general, we distinguish two
kinds of rules: simplification and propagation rules. Simplification rules rewrite
constraints to simpler constraints while preserving logical equivalence, e.g.
X≤Y ∧ Y≤X ⇔ X=Y. Propagation rules add new constraints which are logically
redundant but may cause further simplification, e.g. X≤Y ∧ Y≤Z ⇒ X≤Z.

* The research reported in this paper has been supported by the Bavarian-French 
Hochschulzentrum. 

T. Walsh (Ed.): CP 2001, LNCS 2239, pp. 31-45, 2001.
In this paper, we present an algorithm, called PropMiner, that can be
used to generate propagation rules for constraint predicates defined by means
of a constraint logic program, even when the constraint domain is infinite. The
PropMiner algorithm can be complemented with the algorithm presented in [4] to
transform some propagation rules into simplification rules, improving both the
time and space behavior of constraint solving.

The combination of these techniques can be seen as a true ICS tool. Using
this tool, the user only has to determine the semantics of the constraints of
interest by means of their intensional definitions (a constraint logic program),
and to specify the admissible syntactic form of the rules to be obtained.

Example 1. Consider the following constraint logic program, where
min(A, B, C) means that C is the minimum of A and B:

    min(A, B, C) ← A≤B ∧ C=A.
    min(A, B, C) ← B≤A ∧ C=B.

For the predicate min, our algorithm PropMiner described in Section 2
generates the following propagation rules if the user specifies that the left hand
side of the rules may consist of min constraints and equality constraints:

    min(A, B, C) ⇒ C≤A ∧ C≤B.
    min(A, B, C) ∧ A=B ⇒ A=C.

For example, the second rule means that the constraint min(A, B, C), when
it is known that the input arguments A and B are equal, can propagate the
constraint that the output C must be equal to the input arguments.

If the user additionally allows disequality and less-or-equal constraints on the
left hand side of the rules, the algorithm generates the following rules:

    min(A, B, C) ∧ C≠B ⇒ C=A.
    min(A, B, C) ∧ C≠A ⇒ C=B.
    min(A, B, C) ∧ B≤A ⇒ C=B.
    min(A, B, C) ∧ A≤B ⇒ C=A.

Using the algorithm presented in [1] some propagation rules can be
transformed into simplification rules and we obtain the following rule-based
constraint solver for min:

    min(A, B, C) ⇒ C≤A ∧ C≤B.
    min(A, A, C) ⇔ A=C.
    min(A, B, C) ∧ C≠B ⇒ C=A.
    min(A, B, C) ∧ C≠A ⇒ C=B.
    min(A, B, C) ∧ B≤A ⇔ C=B ∧ B≤A.
    min(A, B, C) ∧ A≤B ⇔ C=A ∧ A≤B.

For example, the goal min(A, B, B) will be transformed into B≤A using the
first propagation rule and then the second last simplification rule. □
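As a sanity check on the propagation rules of Example 1, one can test them by brute force against the two clauses defining min. The Python sketch below does this over a small sample of integers only; it is not part of PropMiner, it proves nothing about infinite domains, and the names `min_holds` and `rule_is_valid` are hypothetical.

```python
from itertools import product

def min_holds(a, b, c):
    # The two clauses of the program: (A<=B and C=A) or (B<=A and C=B).
    return (a <= b and c == a) or (b <= a and c == b)

def rule_is_valid(guard, conclusion, domain):
    """Check 'min(A,B,C) and guard => conclusion' on every triple of the sample domain."""
    return all(
        conclusion(a, b, c)
        for a, b, c in product(domain, repeat=3)
        if min_holds(a, b, c) and guard(a, b, c)
    )

# (extra lhs constraint, rhs) for the six propagation rules of Example 1.
GENERATED_RULES = [
    (lambda a, b, c: True,   lambda a, b, c: c <= a and c <= b),
    (lambda a, b, c: a == b, lambda a, b, c: a == c),
    (lambda a, b, c: c != b, lambda a, b, c: c == a),
    (lambda a, b, c: c != a, lambda a, b, c: c == b),
    (lambda a, b, c: b <= a, lambda a, b, c: c == b),
    (lambda a, b, c: a <= b, lambda a, b, c: c == a),
]

DOMAIN = range(5)
all_valid = all(rule_is_valid(g, c, DOMAIN) for g, c in GENERATED_RULES)
# A deliberately wrong rule, min(A,B,C) and A<=B => C=B, is rejected.
bogus_valid = rule_is_valid(lambda a, b, c: a <= b, lambda a, b, c: c == b, DOMAIN)
```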






The generated rules can be directly encoded in a rule-based programming 
language, e.g. Constraint Handling Rules (CHR) to provide a running con- 
straint solver. The Inductive Constraint Solving tool presented in this paper can 
also be simply used as a software engineering tool to help solver developers to 
find out propagation and simplification rules. 

The paper is organized as follows. In Section 2 we present an algorithm to
generate propagation rules for constraint predicates defined by a constraint logic
program. In Section 3 we give more examples for the use of our algorithm. We
discuss in Section 4 how recursive programs can be handled. Finally, we conclude
with a summary and compare the proposed approach with related work.



2 Generation of Propagation Rules 

In this section, we present an algorithm, called PropMiner, to generate
propagation rules for constraints using the intensional definitions of the constraint
predicates. These definitions are given by means of a program in a constraint
logic programming (CLP) language. We assume some familiarity with constraint
logic programming as defined by Jaffar and Maher in [9] and follow their
definitions and terminology when applicable.

The CLP programs are parameterized by a constraint system defined by
a 4-tuple (Σ, D, L, T) and a signature Π determining the predicate symbols
defined by a program. Σ is a signature determining the predefined predicate and
function symbols, D is a Σ-structure (the domain of computation), L is a class
of Σ-formulas closed by conjunction and called constraints, and T is a first-order
Σ-theory that is an axiomatization of the properties of D.

We require that D is a model of T and that T is satisfaction complete with
respect to L, that is, for every constraint c ∈ L either T ⊨ ∃c or T ⊨ ¬∃c,
where ∃(φ) denotes the existential closure of φ. Note that these requirements
are fulfilled by most commonly used CLP languages.

In the rest of this paper, we use the following terminology. 

Definition 1. A constrained clause is a rule of the form

    H ← H_1 ∧ ... ∧ H_n ∧ c_1 ∧ ... ∧ c_m

where H, H_1, ..., H_n are atoms over Π and c_1, ..., c_m are constraints. A goal is
a set of atoms over Π and constraints, interpreted as their conjunction. An
answer is a set of constraints also interpreted as their conjunction. A CLP program
is a finite set of constrained clauses. The logical semantics of a CLP program P
is its Clark’s completion and is denoted by P*.

In programs, goals and answers, when clear from the context, we use upper 
case letters (resp. lower case and numbers) to denote variables (resp. constants). 
2.1 Rules of Interest

A propagation pattern is a set of constraints and of atoms over Π, interpreted
as their conjunction.

A propagation rule is a rule of the form C_1 ⇒ C_2 or of the form C_1 ⇒ false,
where C_1 is a propagation pattern and C_2 is a set of constraints (also interpreted
as their conjunction). C_1 is called the left hand side (lhs) and C_2 the right hand
side (rhs) of the rule. A rule of the form C_1 ⇒ false is called a failure rule. To
formulate the logical semantics of these rules, we use the following notation: let
V be a set of variables, then ∃_{-V}(φ) denotes the existential closure of φ except
for the variables in V.

Definition 2. A propagation rule {c_1, ..., c_n} ⇒ {d_1, ..., d_m} is valid¹ wrt. the
constraint theory T and the CLP program P if and only if
P*, T ⊨ ⋀_i c_i → ∃_{-V}(⋀_j d_j),
where V is the set of variables appearing in {c_1, ..., c_n}.

A failure rule {c_1, ..., c_n} ⇒ false is valid wrt. T and P if and only if
P*, T ⊨ ¬∃(⋀_i c_i).

To reduce the number of rules that are of no interest for building a solver, we
restrict the generation, by means of a syntactic bias, to a particular set of rules
called relevant propagation rules. These rules must contain in their lhs atoms
corresponding to the predicates on which we want to propagate information, and all
elements in this lhs must be connected by common variables. This is defined
more precisely by the notion of interesting pattern.

Definition 3. A propagation pattern A is an interesting pattern wrt. a
propagation pattern Base_lhs if and only if the following conditions are satisfied:

1. Base_lhs ⊆ A.

2. the graph defined by the relation join_A is connected, where join_A is a
binary relation that holds for pairs of elements in A that share at least one
variable, i.e., join_A = {(c_1, c_2) | c_1 ∈ A, c_2 ∈ A, Var({c_1}) ∩ Var({c_2}) ≠ ∅},
where Var({c_1}) and Var({c_2}) denote the variables appearing in c_1 and c_2,
respectively.

A relevant propagation rule wrt. Base_lhs is a propagation rule such that its
lhs is an interesting pattern wrt. Base_lhs.
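The two conditions of Definition 3 translate directly into a containment test and a graph connectivity test. The following Python sketch assumes, purely for illustration, that each atom or constraint is represented as a pair (text, set of variable names); the function name `is_interesting` is hypothetical.

```python
def is_interesting(pattern, base):
    """Check Definition 3 for a pattern given as (text, frozenset_of_variables) pairs."""
    pattern = list(pattern)
    if not pattern or not set(base) <= set(pattern):
        return False                      # condition 1: Base_lhs must be included
    # Condition 2: depth-first search over the "shares a variable" relation.
    seen, stack = {0}, [0]
    while stack:
        i = stack.pop()
        for k in range(len(pattern)):
            if k not in seen and pattern[i][1] & pattern[k][1]:
                seen.add(k)
                stack.append(k)
    return len(seen) == len(pattern)

p = ("p(X,Y,Z)", frozenset("XYZ"))
x_le_z = ("X<=Z", frozenset("XZ"))
u_eq_a = ("U=a", frozenset("U"))

connected = is_interesting([p, x_le_z], base=[p])      # shares X and Z with p
disconnected = is_interesting([p, u_eq_a], base=[p])   # U occurs nowhere else
missing_base = is_interesting([x_le_z], base=[p])      # p itself is absent
```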

2.2 The PropMiner Algorithm 

In this section, we describe the PropMiner algorithm to generate propagation
rules from a program P expressed in a CLP language determined by (Σ, D, L, T).

The algorithm takes as input the program P, a propagation pattern Base_lhs
and a set of constraints Cand_lhs (for which we already have a built-in solver). It
generates propagation rules that are valid wrt. T and P, relevant wrt. Base_lhs
and such that their lhs are subsets of Base_lhs ∪ Cand_lhs.

¹ The requirement made on CLP programs that T must be satisfaction complete is
not sufficient to ensure the decidability of the propagation rule validity. However, it
should be noticed that the soundness of the algorithm proposed in Section 2.2 is not
based on such a decidability property.
begin
    Let R be an empty set of rules.
    Let L be a list containing all non-empty subsets
        of Base_lhs ∪ Cand_lhs in any order.
    Remove from L any element C which is not an interesting pattern wrt. Base_lhs.
    Order L with any total ordering compatible with the subset partial ordering
        (i.e., for all C_1 in L, if C_2 is after C_1 in L then C_2 ⊄ C_1).
    while L is not empty do
        Let C_lhs be the first element of L and then remove C_lhs from L.
        Let A be the set of answers for the goal C_lhs wrt. the program P.
        if A is empty then
            add the failure rule (C_lhs ⇒ false) to R
            and remove from L each element C such that C_lhs ⊂ C.
        else
            if A is finite then
                compute the set of constraints C_rhs
                    as the least general generalization (lgg) of A
                if C_rhs is not empty then
                    add the rule (C_lhs ⇒ C_rhs) to R
                endif
            endif
        endif
    endwhile
    output R
end

Fig. 1. The PropMiner Algorithm



Principle. From an abstract point of view, the algorithm enumerates each
possible lhs subset of Base_lhs ∪ Cand_lhs (denoted by C_lhs). For each C_lhs it
computes a set of constraints denoted C_rhs such that C_lhs ⇒ C_rhs is valid wrt. T
and P and relevant wrt. Base_lhs.

For each C_lhs, the algorithm PropMiner determines C_rhs by calling the
CLP system to execute C_lhs as a goal and then

1. if C_lhs has no answer then it produces the failure rule C_lhs ⇒ false.

2. if C_lhs has a finite number of answers {Ans_1, ..., Ans_n} then let C_rhs be
the least general generalization (lgg) of {Ans_1, ..., Ans_n} as defined by [15].
C_rhs is then in some sense the strongest constraint common to all answers as
illustrated below (see Example 2). If C_rhs is not empty then the algorithm
produces the rule C_lhs ⇒ C_rhs.

It is clear that these two criteria can be used only if all answers can be
collected in finite time. The application of the algorithm to recursive
programs leading to non-terminating executions is discussed in Section 4.

The algorithm is given in Figure 1. To simplify its presentation, we consider
that all possible lhs are stored in a list. For efficiency reasons the concrete
implementation is based on a tree and unnecessary candidates are not materialized.
More details on the implementation are given in Section 2.4.

A particular ordering is used to enumerate the lhs candidates so that the more
general lhs are tried before the more specific ones. Then, we use the following
pruning criterion which greatly improves the efficiency of the algorithm: if a rule
C_lhs ⇒ false is generated then there is no need to consider any superset of C_lhs
to form another rule lhs.
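The enumeration and the pruning criterion can be sketched in Python as follows. This is an illustrative skeleton, not the SICStus implementation: the CLP system is abstracted as a caller-supplied `answers` function (returning `None` when the answers cannot be collected in finite time), and `lgg` and `is_interesting` are also passed in. The toy oracle at the end hard-codes the two clause bodies of the program used in Example 2 without any constraint simplification, and `toy_lgg` is a plain set intersection, which is weaker than a real lgg.

```python
from itertools import combinations

def prop_miner(base, cand, answers, lgg, is_interesting):
    """Enumerate candidate lhs by increasing size and build rules (lhs, rhs)."""
    elements = list(base) + [c for c in cand if c not in base]
    rules, failed = [], []
    # Increasing size is one total order compatible with the subset ordering.
    for size in range(1, len(elements) + 1):
        for combo in combinations(elements, size):
            lhs = frozenset(combo)
            if not is_interesting(lhs, base):
                continue
            if any(f <= lhs for f in failed):   # pruning: superset of a failed lhs
                continue
            found = answers(lhs)
            if found is None:                   # answers not collectable in finite time
                continue
            if not found:
                rules.append((lhs, "false"))    # failure rule
                failed.append(lhs)
            else:
                rhs = lgg(found)
                if rhs:
                    rules.append((lhs, rhs))
    return rules

# Toy stand-ins (assumptions of this sketch, not a CLP system).
CLAUSE_Q = frozenset({"X<=a", "Y=a", "Z!=b"})    # body reached through q
CLAUSE_P2 = frozenset({"X<=W", "Y=W", "X>Z"})    # body of the second clause of p
calls = []

def toy_answers(lhs):
    calls.append(lhs)
    extra = lhs - {"p"}
    found = []
    if "Z=b" not in lhs:      # Z=b contradicts Z!=b
        found.append(CLAUSE_Q | extra)
    if "X<=Z" not in lhs:     # X<=Z contradicts X>Z
        found.append(CLAUSE_P2 | extra)
    return found

def toy_lgg(found):
    return frozenset.intersection(*found)

rules = prop_miner(
    base=["p"], cand=["X<=Z", "Y=a", "Z=b"],
    answers=toy_answers, lgg=toy_lgg,
    is_interesting=lambda lhs, base: set(base) <= lhs,
)
```

With this oracle the goal containing both X<=Z and Z=b fails, so the four-element candidate is never executed.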

We now illustrate the basic behavior of the algorithm PropMiner on the following
example. More uses of the algorithm are given in Section 3.

Example 2. Consider the following CLP program defining p and q:

    p(X,Y,Z) ← q(X,Y,Z).
    p(X,Y,Z) ← X≤W ∧ Y=W ∧ X>Z.
    q(X,Y,Z) ← X≤a ∧ Y=a ∧ Z≠b.

We use the algorithm to find rules to propagate constraints over propagation
patterns involving p. Let Base_lhs = {p(X,Y,Z)} and let for example Cand_lhs
be the set {X≤Z, Y=a, Z=b}.

When the while loop is entered for the first time we have

    L = { {p(X,Y,Z)}, {p(X,Y,Z), X≤Z}, {p(X,Y,Z), Y=a},
          {p(X,Y,Z), Z=b}, {p(X,Y,Z), X≤Z, Y=a}, {p(X,Y,Z), X≤Z, Z=b},
          {p(X,Y,Z), Y=a, Z=b}, {p(X,Y,Z), X≤Z, Y=a, Z=b} }

Each element in L is executed in turn as a goal and the corresponding answers are
collected and used to build a rule rhs. For example, {p(X,Y,Z), Z=b} leads to a
single answer Ans_1 = {X≤W, Y=W, X>Z, Z=b}. The lgg is simply Ans_1 itself
and we have the propagation rule {p(X,Y,Z), Z=b} ⇒ {X≤W, Y=W, X>Z, Z=b}.
For {p(X,Y,Z), X≤Z} we have again a single answer {X≤a, Y=a, Z≠b, X≤Z}
and thus also a trivial lgg producing the rule
{p(X,Y,Z), X≤Z} ⇒ {X≤a, Y=a, Z≠b, X≤Z}.

For the goal {p(X,Y,Z), Y=a}, the situation is different since we have the
two following answers Ans_1 = {X≤a, Y=a, Z≠b} and Ans_2 = {X≤a, Y=a, X>Z}.
The lgg, which is based on a syntactical generalization, is {X≤a, Y=a} and we
have the rule {p(X,Y,Z), Y=a} ⇒ {X≤a, Y=a}.

The situation may be more tricky. For example, the goal {p(X,Y,Z)} has
two answers Ans_1 = {X≤a, Y=a, Z≠b} and Ans_2 = {X≤W, Y=W, X>Z}
having no common element. Fortunately, the lgg corresponds in some sense to
the least upper bound of {Ans_1, Ans_2} wrt. the θ-subsumption ordering [15]
(more precisely it represents the equivalence class of constraints that corresponds
to this least upper bound). Thus, the lgg of {Ans_1, Ans_2} is {X≤E, Y=E},
where E is a new variable, and the algorithm produces the rule
{p(X,Y,Z)} ⇒ {X≤E, Y=E}. However, it should be noticed that the notion of lgg
is not based on the semantics of the constraints in the set of answers. Thus, two
sets of answers that are equivalent wrt. the constraint theory but not identical from
a syntactic point of view will lead in general to different lgg's. As shown in
later sections, the user can partially overcome this difficulty by providing
ad hoc propagation rules to take into account the constraint semantics.
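The syntactic generalization used in Example 2 can be illustrated with a small anti-unification routine in Python. It is a simplified sketch, assuming every constraint is a flat triple (operator, left, right) over constants and variables: it computes the pairwise generalization of two answers in the style of Plotkin, without the reduction step of a full lgg and without treating = as symmetric. The name `lgg` and the generated variable names E0, E1, ... are choices of this sketch.

```python
from itertools import count

def lgg(ans1, ans2):
    """Unreduced pairwise generalization of two sets of (op, left, right) constraints."""
    fresh = {}            # one new variable per distinct pair of differing terms
    numbers = count()

    def generalize(s, t):
        if s == t:
            return s
        if (s, t) not in fresh:
            fresh[(s, t)] = f"E{next(numbers)}"
        return fresh[(s, t)]

    return {
        (op1, generalize(l1, l2), generalize(r1, r2))
        for (op1, l1, r1) in ans1
        for (op2, l2, r2) in ans2
        if op1 == op2     # only constraints with the same symbol are compatible
    }

# Answers for the goal {p(X,Y,Z)}: the pair (a, W) is generalized to one new variable.
ans1 = {("<=", "X", "a"), ("=", "Y", "a"), ("!=", "Z", "b")}
ans2 = {("<=", "X", "W"), ("=", "Y", "W"), (">", "X", "Z")}
general = lgg(ans1, ans2)

# Answers for the goal {p(X,Y,Z), Y=a}: the common constraints survive unchanged.
ans3 = {("<=", "X", "a"), ("=", "Y", "a"), (">", "X", "Z")}
common = lgg(ans1, ans3)
```

Because the same pair of terms always maps to the same new variable, the link between the bound on X and the value of Y is preserved in the generalization.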

The effect of the pruning criterion is straightforward. The goal
G = {p(X,Y,Z), X≤Z, Z=b} has no answer and leads to the rule
{p(X,Y,Z), X≤Z, Z=b} ⇒ false. Then the element {p(X,Y,Z), X≤Z,
Y=a, Z=b}, which is a superset of G, is simply removed from L and will not
be considered to generate any rule.

Properties. It is straightforward to see that the algorithm is complete in the
sense that if C_lhs ⊆ Base_lhs ∪ Cand_lhs is an interesting pattern wrt. Base_lhs
and there is no C ⊂ C_lhs such that C ⇒ false is valid, then C_lhs is considered
by the algorithm as a candidate to form the lhs of a rule.

To establish the soundness of the algorithm, we need the following results
presented in [9].

Theorem 1. Let P be a program in the CLP language determined by
(Σ, D, L, T), where D is a model of T. Suppose that T is satisfaction complete
wrt. L, and that P is executed on a CLP system for this language. Then:

1. If a goal G has a finite computation tree, with answers c_1, ..., c_n then
P*, T ⊨ G ↔ ∃_{-V}(c_1 ∨ ... ∨ c_n), where V is the set of variables
appearing in G.

2. If a goal G is finitely failed for P then P*, T ⊨ ¬G.

The soundness of PropMiner is stated by the following theorem. 

Theorem 2 (Soundness). The PropMiner algorithm produces propagation
rules that are relevant wrt. Base_lhs and valid wrt. T and P.

Proof. All C_lhs considered are interesting patterns wrt. Base_lhs, thus only
relevant rules can be generated. If a rule of the form C_lhs ⇒ false is produced
then by property 2 in Theorem 1 this rule is valid. Suppose a rule of the
form C_lhs ⇒ C_rhs is generated. Then C_rhs is the lgg of a finite set of answers
{Ans_1, ..., Ans_n} obtained by the execution of the goal C_lhs on the program P.

By property 1 in Theorem 1 we have P*, T ⊨ C_lhs ↔ ∃_{-V}(Ans_1 ∨ ... ∨
Ans_n), where V is the set of variables appearing in C_lhs. Since C_rhs is the lgg of
{Ans_1, ..., Ans_n}, by [15] we know that Ans_1 ∨ ... ∨ Ans_n → C_rhs. Thus
P*, T ⊨ C_lhs → ∃_{-V} C_rhs, i.e. C_lhs ⇒ C_rhs is valid wrt. T and P. □
2.3 Interesting Rules for Constraint Solvers

The basic form of the PropMiner algorithm given in Figure 1 produces a
very large set of rules. Most of these rules are redundant (partly or completely),
or propagate constraints that are too weak, or on the contrary propagate too many
stronger constraints (inflating considerably the constraint store at runtime), and
thus may be of little interest to build a constraint solver.

We present in this section mandatory complementary processing that is
integrated in the basic algorithm in order to generate rules of practical interest
wrt. solver construction.

Consider again the CLP program of Example 2. Let Base_lhs = {p(X,Y,Z)}
and let us use a richer set of constraints to form the lhs of the rules: Cand_lhs =
{X≤Z, Y≤X, X=Z, Y=Z, X=b, Y=a, Z=b}.

Among the rules generated by the basic algorithm PropMiner, we have: 



    {p(X,Y,Z)} ⇒ {X≤E, Y=E}.                         (1)
    {p(X,Y,Z), X≤Z} ⇒ {X≤a, Y=a, Z≠b, X≤Z}.          (2)
    {p(X,Y,Z), Y≤X} ⇒ {X≤E, Y=E, Y≤X}.               (3)
    {p(X,Y,Z), X=Z} ⇒ {X≤a, Y=a, Z≠b, X=Z}.          (4)
    {p(X,Y,Z), Y=Z} ⇒ {X≤a, Y=a, Z≠b, Y=Z}.          (5)
    {p(X,Y,Z), X=b} ⇒ {X≤E, Y=E, X=b}.               (6)
    {p(X,Y,Z), Y=a} ⇒ {X≤E, Y=E, Y=a}.               (7)
    {p(X,Y,Z), Z=b} ⇒ {X≤W, Y=W, X>Z, Z=b}.          (8)
    {p(X,Y,Z), X≤Z, Z=b} ⇒ false.                    (9)



Since the algorithm only imposes that the exploration ordering is a total
ordering compatible with the subset ordering on the lhs, the actual order of the
rules generated may be slightly different according to implementation choices
(see Section 2.4). However, the specific processing presented in this section can
still be applied.

Removing redundancy. The key idea of the simplification is to remove from
the rhs of a rule R all constraints that can be derived from the lhs of R using the
built-in solvers and the rules already generated. If the remaining rhs is empty
then the whole rule can be suppressed.

For example, according to this process rule (6) is removed because its rhs is
fully redundant wrt. its lhs and wrt. rule (1). For rule (2) only the rhs is modified
and becomes {X≤a, Y=a, Z≠b}, since X≤Z is trivially entailed by the lhs of
the rule.

Depending on the behavior of the built-in solvers, rule (4) may be only
transformed into {p(X,Y,Z), X=Z} ⇒ {X≤a, Y=a, Z≠b}, while if we know the
semantics of ≤ we may use rule (2) to derive the same constraints. If the built-in
solver does not allow this redundancy to be discovered, then in our implementation
(see Section 2.4) the user can add in a simple way propagation rules to derive
explicitly logical consequences of the built-in constraints. In this example, one of
the complementary rules that can be provided by the user is {X=Z} ⇒ {X≤Z},
which allows us to find that rule (4) is then fully redundant wrt. rule (2).
This simplification process also applies to failure rules. Suppose that the
built-in solver is able to detect that Z=b ∧ Z≠b is inconsistent, then the rule (9)
is removed since it is redundant wrt. rule (2).



Generating stronger rhs. If we consider rule (6) {p(X,Y,Z), X=b} ⇒
{X≤E, Y=E, X=b}, the rhs constructed from the least general generalization
of the answers obtained for the goal {p(X,Y,Z), X=b} is in some sense too
general. The execution of the goal gives two answers, one containing {Z≠b}
and the other {X>Z, X=b}. From a semantical point of view, this leads clearly
to Z≠b in both cases, but the least general generalization is mainly syntactical
and does not retain this information.

If we want a richer rhs (containing Z≠b) then we must have at hand a (built-in)
solver that propagates {Z≠b} also in the second answer. If we do not have
such a solver, then here again the user can provide complementary propagation
rules (in this example the single rule {X>Y} ⇒ {X≠Y}) to produce
this piece of information.



Projecting variables. For efficiency reasons in constraint solving it is partic- 
ularly important to limit the number of variables. 

Then a rule like {p{X,Y, Z)} {X<E, Y=E} should be avoided since it 

will create a new variable each time it is fired. 

So, we simply project out such useless variables in the following way. We
consider in turn each equality in the rhs of a rule. If this equality is of the
form E=F or F=E, where E and F are variables and E does not appear in the
lhs of the rule, then we remove this equality from the rhs and we apply the
substitution transforming E into F to the whole remaining rhs.
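The projection step can be sketched as below. This is a minimal sketch under assumptions: constraints are (symbol, left, right) tuples, variables are recognised by the Prolog convention of a leading uppercase letter, and the names `is_var` and `project` are hypothetical rather than taken from the prototype.

```python
def is_var(term):
    """Prolog-style convention assumed here: variables start with an uppercase letter."""
    return term[:1].isupper()

def project(lhs_vars, rhs):
    """Remove each equality E=F or F=E in which E is a variable absent from
    the lhs, and replace E by F in the rest of the rhs."""
    rhs = list(rhs)
    i = 0
    while i < len(rhs):
        sym, a, b = rhs[i]
        pair = None
        if sym == "eq" and is_var(a) and is_var(b):
            if a not in lhs_vars:
                pair = (a, b)          # equality written E=F
            elif b not in lhs_vars:
                pair = (b, a)          # equality written F=E
        if pair is None:
            i += 1
            continue
        e, f = pair
        del rhs[i]                     # drop the equality itself
        rhs = [(s, f if l == e else l, f if r == e else r) for s, l, r in rhs]
    return rhs

# {p(X,Y,Z)} => {X≤E, Y=E} becomes {p(X,Y,Z)} => {X≤Y}
print(project({"X", "Y", "Z"}, [("leq", "X", "E"), ("eq", "Y", "E")]))
```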

More subtle situations may arise. Suppose that the second clause of the program
given in Example 2 was p(X, Y, Z) ← X≤W ∧ Y=W ∧ Z≠a. Then, the
first rule generated would have been {p(X, Y, Z)} ⇒ {X≤E, Y=E, Z≠F}, and
projecting out E would transform it into {p(X, Y, Z)} ⇒ {X≤Y, Z≠F}.

Then, during constraint solving, each application of this rule adds to the store
the constraint Z≠F, where F is a new variable. This phenomenon leads in general
to a rather inefficient solving process. So, we propose the following optional
treatment: when all the previous processing has been performed (simplification,
additional propagation, and projection of variables in equalities), the user can
choose to apply a strict range restriction criterion: all constraints in the rhs
containing a variable that does not appear in the lhs are removed (e.g., Z≠F in the
previous rule). This range restriction criterion is applied in all examples presented
in this paper. However, it should be noted that this process remains optional,
since this simplification criterion is purely syntactic and does not guarantee that
the constraints removed from the rhs are semantically redundant; it may thus
produce weaker rules (although still valid).
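The range restriction criterion amounts to a purely syntactic filter, which can be sketched as follows. The tuple encoding of constraints and the names `is_var` and `range_restrict` are assumptions made for this illustration only.

```python
def is_var(term):
    """Prolog-style convention assumed here: variables start with an uppercase letter."""
    return term[:1].isupper()

def range_restrict(lhs_vars, rhs):
    """Keep only the rhs constraints whose variables all occur in the lhs."""
    return [c for c in rhs
            if all(t in lhs_vars for t in c[1:] if is_var(t))]

# {p(X,Y,Z)} => {X≤Y, Z≠F}: the constraint Z≠F mentions the new variable F
print(range_restrict({"X", "Y", "Z"}, [("leq", "X", "Y"), ("neq", "Z", "F")]))
```

Because the test looks only at variable names, a dropped constraint is not necessarily redundant, which is why the criterion can weaken a rule.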
2.4 Implementation Issues

The key aspects of our implementation of the PropMiner algorithm are pre- 
sented in this section. The prototype has been developed under SICStus Prolog 
3.7.1. It is written in Prolog and takes advantage of the rule-based programming 
language Constraint Handling Rules (CHR) supported in this environment. 

Using CHR. The CHR language facilitates the implementation of the processing
described in Section 2.3 in two ways. Firstly, we can use the generated rules
as CHR rules and then run CHR to decide whether a rule propagates new
constraints wrt. the rules we already have. Secondly, the user can directly add
new rules to perform complementary propagations wrt. the built-in solvers, as
mentioned in Section 2.3.

Clause encoding. It should be noted that in this environment the equality
= is reserved to specify unification. So in practice, we use another binary
predicate to denote the equality constraint. Moreover, the bindings of the variables
due to the resolution steps are not handled explicitly as equalities in the store.
Suppose that the third clause of the program given in Example 2 was written
in the form q(X, a, Z) ← X≤a ∧ Z≠b. Then, for the goal {p(X, Y, Z), X≤Z},
we may not have collected the constraint Y=a explicitly, and thus Y=a will not
appear in the rhs of rule (2). Thus, we simply preprocess the clauses so that the
atom in the head of a clause does not contain functors (including constants) or
coreferences. The corresponding functors and coreferences are simply encoded
by equality constraints in the body of the clause. For example, a head of the form
p(X, a, X) will be transformed into p(X, Y, Z), and X=Z ∧ Y=a will be added
to the body.
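The head preprocessing can be sketched as below, assuming a simplified setting in which head arguments are either variables or constants (compound terms are not handled). The function name `flatten_head`, the tuple encoding, and the fresh-variable names V1, V2, ... are hypothetical choices for this sketch.

```python
from itertools import count

def is_var(term):
    """Prolog-style convention assumed here: variables start with an uppercase letter."""
    return term[:1].isupper()

def flatten_head(args, body):
    """Replace constants and repeated variables in a clause head by fresh
    variables, recording each replacement as an equality in the body."""
    used = set(args) | {t for c in body for t in c[1:]}
    fresh = (f"V{i}" for i in count(1))
    seen, new_args, eqs = set(), [], []
    for t in args:
        if is_var(t) and t not in seen:
            seen.add(t)                       # first occurrence of a variable is kept
            new_args.append(t)
            continue
        v = next(n for n in fresh if n not in used)
        used.add(v)
        new_args.append(v)
        # constant: V = c ; coreference: X = V
        eqs.append(("eq", t, v) if is_var(t) else ("eq", v, t))
    return new_args, eqs + list(body)

# p(X, a, X) becomes p(X, V1, V2) with V1=a and X=V2 added to the body
print(flatten_head(["X", "a", "X"], []))
```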

Enumeration of lhs. The PropMiner algorithm enumerates the possible
lhs (the elements in L). The implementation of this enumeration is based on the
exploration of a tree corresponding to the lhs search space. This tree is explored
using a depth-first strategy. As in [3], the branches are expanded using a partial
ordering on the lhs candidates such that the more general lhs are examined
before more specialized ones. The partial ordering used in our implementation
is the θ-subsumption ordering.



3 Practical Uses of PropMiner 

In this section, we show by examples that a practical application of our approach
lies in solver development. All the sets of rules presented in this section have been
generated in a few seconds on a Pentium 3 PC with 128 MBytes of memory and
a 500 MHz processor.

For convenience, we introduce the following notation. Let c be a constraint
symbol of arity 2 and D1 and D2 be two sets of terms. We define
atomic(c, D1, D2) as the set of all constraints built from c over D1 × D2. More
precisely,

atomic(c, D1, D2) = {c(α, β) | α ∈ D1 and β ∈ D2}.
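As a small illustration of this notation, the set can be built with a Cartesian product. The tuple encoding (symbol, α, β) and the symbol names "eq", "neq" and "leq" are assumptions of this sketch.

```python
from itertools import product

def atomic(c, d1, d2):
    """All constraints c(alpha, beta) with alpha in d1 and beta in d2."""
    return {(c, a, b) for a, b in product(d1, d2)}

# A candidate set over the arguments of a ternary predicate
variables = {"A", "B", "C"}
candidates = (atomic("eq", variables, variables)
              | atomic("neq", variables, variables)
              | atomic("leq", variables, variables))
print(len(candidates))
```

Each call yields |D1| × |D2| constraints, so the candidate set above contains 27 elements.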
Example 3. For the minimum predicate min(A, B, C) defined by the CLP program
of Example 1, the PropMiner algorithm with the following input

Base_lhs = {min(A, B, C)}

Cand_lhs = atomic(=, {A, B, C}, {A, B, C}) ∪
           atomic(≠, {A, B, C}, {A, B, C}) ∪
           atomic(≤, {A, B, C}, {A, B, C})

generates the 6 propagation rules presented in Example 1.

It should be noted that, to be able to generate the first rule, the following
rules for equality and less-or-equal constraints have to be present in the built-in
solver to ensure the generation of stronger rhs (as illustrated in Section 2.3):

X≤Y ∧ Y≤Z ⇒ X≤Z.
X=Y ⇒ X≤Y.

If these rules are not already in the built-in solver, in our implementation
the user can provide them very easily by means of CHR rules (see Section 2.4).
Moreover, using this possibility, PropMiner can incorporate additional knowledge
given by the user about the predicate of interest. For example, the user can
express the symmetry of min with respect to the first and second arguments
by the rule:

min(A, B, C) ⇒ min(B, A, C).

If this rule is provided by the user as a CHR rule, it completes the built-in solver
and then the PropMiner algorithm generates only the following simplified set
of 4 rules:

min(A, B, C) ⇒ C≤A ∧ C≤B.
min(A, B, C) ∧ A=B ⇒ A=C.
min(A, B, C) ∧ C≠B ⇒ C=A.
min(A, B, C) ∧ B≤A ⇒ C=B.



Example 4. If we consider the maximum predicate max, a set of rules similar to
the rules for min is generated by PropMiner. Then the user has the possibility
to add these two sets of rules to the built-in solver and to execute PropMiner
to generate interaction rules between min and max. This execution is performed
with the following input

Base_lhs = {min(A, B, C) ∧ max(D, E, F)}

Cand_lhs = atomic(≠, {A, B, C}, {D, E, F})



and a CLP program consisting of the definitions of min and max. Since the
propagation rules specific to min and max alone have been added to the built-in
solver, PropMiner takes advantage of these rules to eliminate many redundancies.
Thus only 10 propagation rules specific to the conjunction of min with max
are generated. Examples of rules are:

min(A, B, C) ∧ max(D, E, F) ∧ C≠E ∧ C≠D ⇒ F≠C.
min(A, B, C) ∧ max(D, E, F) ∧ B≠D ∧ A≠D ⇒ D≠C.
min(A, B, C) ∧ max(D, E, F) ∧ C≠E ∧ B≠D ∧ A≠F ⇒ F≠C.
min(A, B, C) ∧ max(D, E, F) ∧ C≠D ∧ B≠F ∧ A≠E ⇒ F≠C.



4 Handling Recursive Constraint Definitions 

In this section, we show informally that the algorithm PropMiner can be ap- 
plied when the CLP program P defining the constraint predicates is recursive 
and may lead to non-terminating executions. 

As presented in Figure 2, for each possible rule lhs in L (denoted by C_lhs)
the algorithm needs to collect in finite time all answers to the goal C_lhs wrt. the
program P. In general, we cannot guarantee such a termination property, but
we can use standard Logic Programming solutions developed to handle recursive
clauses. For example, we can prefer a resolution based on the OLDT scheme,
which ensures finite refutations more often than a resolution following the SLD
principle (e.g., with the OLDT resolution the execution always terminates for
Datalog programs).

We can also decide to bound the depth of the resolution to stop the execution
of a goal that may cause non-termination. In this case, if the execution of
goal C_lhs has a resolution depth exceeding a given threshold, we interrupt this
execution and proceed with the next possible lhs in L. Of course this strategy
may be too restrictive, in the sense that it may stop some terminating
executions too early and thus may prevent the generation of some interesting rules.

Example 5. Consider the well-known ternary append predicate for lists, which
holds if its third argument is the concatenation of the first and the second
argument. It is usually implemented by these two clauses:

append(X, Y, Z) ← X=[] ∧ Y=Z.
append(X, Y, Z) ← X=[H|X1] ∧ Z=[H|Z1] ∧ append(X1, Y, Z1).

Then, if we bound the resolution depth to discard non-terminating executions,
the algorithm PropMiner terminates and, using the appropriate input, produces,
among others, the following rules:

append(A, B, C) ∧ A=B ∧ C=[D] ⇒ false.
append(A, B, C) ∧ B=C ∧ C=[D] ⇒ A=[].
append(A, B, C) ∧ C=[] ⇒ B=[] ∧ A=[].
append(A, B, C) ∧ A=[] ⇒ B=C.
5 Conclusion and Related Work

We have presented an approach to generate rule-based constraint solvers from
the intensional definition of the constraint predicates given by means of a CLP
program. The generation is performed in two steps. In a first step, it produces
propagation rules using the algorithm PropMiner described in Section 2, and
in a second step it transforms some of these rules into simplification rules using
the method proposed in [1].

Now, we briefly compare our work to other approaches and give directions 
for future work. 

- In [WP] first steps towards the automatic generation of propagation rules have
  been taken. In these approaches the constraints are defined extensionally over
  finite domains by, e.g., a truth table or their solution tuples. Thus, this paper
  can be seen as an extension of these previous works towards constraints
  defined intensionally over infinite domains. Over finite domains, the algorithm
  PropMiner can be used to generate the rules produced by the other
  methods.



Example 6. For the boolean negation neg(X, Y), the algorithm PropMiner
and the algorithm described in [2] generate the same rules:

neg(X, X) ⇒ false.
neg(X, 1) ⇒ X=0.
neg(X, 0) ⇒ X=1.
neg(1, Y) ⇒ Y=0.
neg(0, Y) ⇒ Y=1.



- Generalized Constraint Propagation [16] extends the propagation mechanism
from finite domains to arbitrary domains. The idea is to find and propagate 
a simple approximation constraint that is a kind of least upper bound of a 
set of computed answers to a goal. In contrast to our approach where the 
generation of rules is done once at compile time, generalized propagation is 
performed at runtime. 

- Constructive Disjunction [HEol is a way to extract common information from
  disjunctions of constraints over finite domains. We are currently investigating
  how constructive disjunction can be used in our case to enhance the
  computation of the least upper bound of a set of answers in the case of
  constraints over finite domains. One advantage is that this approach can collect
  more information since it takes into account the semantics of the arithmetic
  operators, comparison predicates, and interval constraints.

- In ILP [T2] and ICLP [13111111)118], the user is interested in finding logic
  programs and CLP programs from examples. In our case, we generate constraint
  solvers in the form of propagation and simplification rules, using the
  definition of the constraint predicates given by means of a CLP program.
  We used techniques also used in ILP and ICLP (e.g., [15]), and it is important
  to consider which of the works done in these fields may be used for the
  generation of constraint solvers.

To our knowledge, the work done on Generalized Constraint Propagation,
Constructive Disjunction, and in the fields of ILP and ICLP has not previously
been adapted or applied to the generation of rule-based constraint solvers.

Future work includes the extension of the algorithm PropMiner to generate
more information to be propagated in the right-hand side of the rules. In the
current algorithm, the computation of the least upper bound of a set of answers
is based on [T^ which does not rely on the semantics of the constraints in the
answers. As illustrated in Section 2.3 and Section 3, the user can provide by
hand propagation rules to take this semantics (partially) into account, but, as
it has been pointed out to us, approaches like [M] can be used to embed this
semantics in a more general way and directly in the computation of the least
upper bound. Another complementary aspect that needs to be investigated is
the completeness of the solvers generated. It is clear that in general this property
cannot be guaranteed, but in some cases it may be possible to check it, or at
least to characterize the kind of consistency the solver can ensure.
Abstract. Although constraint programming offers a wealth of strong, general-
purpose methods, in practice a complex, real application demands a person who 
selects, combines, and refines various available techniques for constraint satis- 
faction and optimization. Although such tuning produces efficient code, the 
scarcity of human experts slows commercialization. The necessary expertise is 
of two forms: constraint programming expertise and problem-domain expertise. 
The former is in short supply, and even experts can be reduced to trial and error 
prototyping; the latter is difficult to extract. The project described here seeks to 
automate both the application of constraint programming expertise and the
extraction of domain-specific expertise. It applies FORR, an architecture for
learning and problem-solving, to constraint solving. FORR develops expertise
from multiple heuristics. A successful case study is presented on coloring 
problems. 



1 Introduction 

Difficult constraint programming problems require human experts to select, combine 
and refine the various techniques currently available for constraint satisfaction and 
optimization. These people “tune” the solver to fit the problems efficiently, but the 
scarcity of such experts slows commercialization of this successful technology. The 
few initial efforts to automate the production of specialized software have thus far fo- 
cused on choosing among methods or constructing special purpose algorithms [1-4]. 

Although a properly-touted advantage of constraint programming is its wealth of 
good, general-purpose methods, at some point complex, real applications require hu- 
man expertise to produce a practical program. This expertise is of two forms: con- 
straint programming expertise and problem domain expertise. The former is in short 
supply, and even experts can be reduced to trial and error prototyping; the latter is dif- 
ficult to extract. This project seeks to automate both the application of constraint pro- 
gramming expertise and the extraction of domain-specific expertise. 

Our goal is to automate the construction of problem-specific or problem-class-specific
constraint solvers with a system called ACE (Adaptive Constraint Engine).
ACE is intended to support the automated construction of such constraint solvers in a
number of different problem domains. Each solver will incorporate a learned,
collaborative “community” of heuristics appropriate for its problem or problem class. Both
the way in which they collaborate and some of the heuristics themselves will be
learned.

This paper reports initial steps toward that goal in the form of a case study that ap- 
plies FORR, a well-tested, collaborative, problem-solving architecture, to a subset of 
constraint programming: graph coloring. The FORR architecture permits swift estab- 
lishment of a well-provisioned base camp from which to explore this research frontier 
more deeply. Section 2 presents some minimal background, including a description of 
FORR. Section 3 presents the initial, successful case study. Section 4 outlines further 
opportunities and challenges. Section 5 is a brief conclusion. 



2 The Problem 

We provide here some minimal background information on CSP’s and on the FORR 
(FOr the Right Reasons) architecture. Further details will be provided on a need-to- 
know basis during our description of the case study. 

2.1 CSP 

Constraint satisfaction problems involve a set of variables, a domain of values for 
each variable, and a set of constraints that specify which combinations of values are 
allowed [5-8]. A solution is a value for each variable, such that all the constraints are 
satisfied. For example, graph coloring problems are CSP’s: the variables are the graph 
vertices, the values are the available colors, and the constraints specify that neigh- 
boring vertices cannot have the same color. The basic CSP paradigm can be extended 
in various directions, for example to encompass optimization or uncertainty. Solution 
methods generally involve some form of search, often interleaved with some form of 
inference. 

Many practical problems - such as resource allocation, scheduling, configuration, 
design, and diagnosis - can be modeled as constraint satisfaction problems. The tech- 
nology has been widely commercialized, in Europe even more so than in the U.S. 
This is, of course, an NP-hard problem area, but there are powerful methods for solv- 
ing difficult problems. Artificial intelligence, operations research, and algorithmics all 
have made contributions. There is considerable interest in constraint programming 
languages. Although we take an artificial intelligence approach, we expect our results 
to have implications for constraint programming generally. 

Constraint satisfaction problem classes can be defined by “structural” or “seman- 
tic” features of the problem. These parameterize the problem and establish a multidi- 
mensional problem space. We will seek to synthesize specialized solvers that operate 
efficiently in different portions of that space. 



2.2 FORR 

FORR is a problem-solving and learning architecture for the development of expertise
from multiple heuristics. It is a mixture-of-experts decision maker, a system that
combines the opinions of a set of procedures called experts to make a decision [9, 10].
This approach is supported by evidence that people integrate a variety of strategies to
accomplish problem solving [11-13].

A FORR-based artifact is constructed for a particular set of related tasks called a 
domain, such as path finding in mazes [14] or game playing [15]. A FORR-based 
program develops expertise during repeated solution attempts within a problem class, 
a set of problems in its domain (e.g., contests at the same game or trips with different 
starting and ending points in the same maze). 

FORR-based applications have produced expert-level results after as few as 20
experiences in a problem class. Learning is relatively fast because a FORR-based appli-
cation begins with prespecified, domain-specific knowledge. To some extent, a 
FORR-based application resembles a person who already has substantial general ex- 
pertise in a domain, and then develops expertise for a new problem class. Such a per- 
son is already aware of general principles that may support expert behavior, and also 
recognizes what is important to learn about a new class, how to acquire that informa- 
tion, and how to apply it. In FORR, that information is called useful knowledge, and 
the decision principles are called Advisors. FORR learns weights to reflect the reli- 
ability and utility of Advisors. 

Useful knowledge is knowledge that is possibly reusable and probably correct. In 
path finding, for example, a dead-end is a particular kind of useful knowledge, an 
item. Each item of useful knowledge is expected to be relevant to every class in the 
domain. The values for a particular useful knowledge item, however, are not known 
in advance; dead-ends, for example, must be learned, and they will vary from one 
maze to another. This is what is meant by problem-class-specific useful knowledge. 

A FORR-based program learns when it attempts to solve a problem, or when it ob- 
serves an external expert solve one. The program is provided in advance with a set of 
useful knowledge items. Each item has a name (e.g., “dead-end”), a learning algo- 
rithm (e.g., “detect backing out”), and a trigger (e.g., “learn after each trip”). A 
learning trigger may be set for after a decision, after a solution attempt, or after a se- 
quence of solution attempts. When a useful knowledge item triggers, its learning algo- 
rithm executes, and the program acquires problem-class-specific useful knowledge. 
Note that there is no uniform learning method for useful knowledge items; in this
sense, FORR truly supports multi-strategy learning.

FORR organizes Advisors into a hierarchy of tiers (see Figure 1), based upon their 
correctness and the nature of their response. A FORR-based program begins with a 
set of prespecified Advisors intended to be problem-class-independent, that is, rele- 
vant to most classes in the domain. Each Advisor represents some domain-specific 
principle likely to support expert behavior. 

Each Advisor is represented as a time-limited procedure that accepts as input the 
current problem-solving state, the legal actions from that state, and any useful knowl- 
edge that the program has acquired about the problem class. Each Advisor produces 
as output its opinion on any number of the current legal actions. An opinion is repre- 
sented as a comment, of the form <strength, action, Advisor> where strength is an in- 
teger in [0, 10]. A comment expresses an Advisor’s support for (strength > 5), or op- 
position to (strength < 5), a particular action. Comments may vary in their strength, 
but an Advisor may not comment more than once on any action in the current state. 




Collaborative Learning for Constraint Solving 49 




Fig. 1. How the FORR architecture organizes and manages Advisors to make a decision. PWL
produces the weights applied for voting.



Our work applies FORR to CSP. To apply FORR to a particular application do- 
main, one codes definitions of problem classes and useful knowledge items, along 
with algorithms to learn the useful knowledge. In addition, one postulates Advisors, 
assigns them to tiers, and codes them as well. Effective application of the architecture 
requires a domain expert to provide such insights. The feasibility study of the next
section was generated relatively quickly, within the framework of Figure 1. The
future work outlined in Section 4, however, is expected to require substantial changes to
FORR.
3 Case Study

This case study on graph coloring is provided to introduce the basic approach we are 
pursuing, and to demonstrate its potential. We understand that there is a vast literature 
on graph coloring; we do not wish to give the erroneous impression that we believe 
that this study makes a serious contribution to it. 



3.1 GC, the Graph Colorer 

Graph Colorer (GC) is a FORR-based program for the specific CSP problem domain 
of graph coloring. We developed it as a proof of concept demonstration that the 
FORR architecture is a suitable basis for a research program aimed at learning col- 
laborative algorithms attuned to classes of similar problems. GC includes only a few 
Advisors and learns only weights, but its results are quite promising. For GC, a prob- 
lem class is the number of vertices and edges in a c-colorable graph, and a problem is 
an instance of such a graph. For example, a problem class might be specified as 4-
colorable on 20 vertices with 10% edge density. (“Percentage edge density” here
actually refers to the percentage of possible edges above the minimum of n-1; in this
case, 10% edge density means 19 + 17 = 36 edges.) A problem in that class would be a particular
4-colorable graph on 20 vertices with 36 edges. Problems are randomly generated, 
and are guaranteed to have at least one solution. There are, of course, a great many 
potential graphs in any given problem class. 
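The edge count in this example can be checked with a short calculation. The sketch below assumes that the percentage is applied to the edges of the complete graph beyond the minimal n-1 and that the result is truncated to an integer; the truncation and the function name `edges_for_density` are assumptions, since the text gives only the final figure.

```python
def edges_for_density(n, density_percent):
    """Number of edges for a graph on n vertices at the given percentage
    of the possible edges above the minimal n - 1."""
    minimal = n - 1                       # 19 for n = 20
    possible = n * (n - 1) // 2           # 190 edges in the complete graph
    extra = density_percent * (possible - minimal) // 100   # 10% of 171, truncated to 17
    return minimal + extra

print(edges_for_density(20, 10))
```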

GC basically simulates a standard CSP algorithm, forward checking. A world state 
for GC is a legally, partially (or fully) colored graph. On each iteration, GC either se- 
lects a vertex to color or, if a vertex has already been selected, colors it. Color selec- 
tion is random. Our objective was to have GC learn an efficient way to select the ver- 
tices. In CSP terms, we wanted to acquire an efficient variable ordering heuristic [16- 
19]. After a color is chosen for a vertex, that color is removed from the domain of 
neighboring vertices. If after a coloring iteration, some vertex is left without any legal 
colors, then the state is automatically transformed by retracting that coloring and re- 
moving it from the legal colors that vertex may subsequently assume. If necessary, 
vertices can be “uncolored” to simulate backtracking. Thus, given enough time and 
space, GC is complete, that is, is capable of finding a solution. 
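For readers unfamiliar with forward checking, the following is a minimal sketch of the standard algorithm on graph coloring. It is not GC itself: it uses ordinary recursive backtracking, the default vertex ordering (smallest current domain first) is only a placeholder for whatever ordering a learner would supply, and the names `forward_check_color` and `select` are hypothetical.

```python
import random

def forward_check_color(adj, colors, select=None, rng=None):
    """adj maps each vertex to the set of its neighbours. Returns a dict
    vertex -> color, or None if no proper coloring exists."""
    rng = rng or random.Random(0)
    if select is None:
        # Placeholder variable ordering: smallest current domain first.
        select = lambda uncolored, domains: min(uncolored, key=lambda v: len(domains[v]))

    def solve(assignment, domains):
        uncolored = [v for v in adj if v not in assignment]
        if not uncolored:
            return assignment
        v = select(uncolored, domains)
        options = sorted(domains[v])
        rng.shuffle(options)                      # color selection is random
        for c in options:
            pruned = dict(domains)
            pruned[v] = {c}
            for u in adj[v]:
                if u not in assignment:
                    pruned[u] = domains[u] - {c}  # forward check the neighbours
            if all(pruned[u] for u in uncolored): # no domain was wiped out
                result = solve({**assignment, v: c}, pruned)
                if result is not None:
                    return result
        return None                               # every color failed: backtrack

    return solve({}, {v: set(colors) for v in adj})

cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(forward_check_color(cycle, ["r", "g", "b"]))
```

Passing a different `select` function changes only the order in which vertices are chosen, which is the part of the search that GC tries to learn.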

Figure 2 shows how FORR has been applied to produce GC. GC has two tier-1
Advisors. In tier 1, FORR maintains a presequenced list of prespecified, always correct
Advisors (see Figure 1). A FORR-based artifact begins the decision-making
process there, with the current position, the legal actions from it, and any
useful knowledge thus far acquired about the problem class. When a tier-1 Advisor
comments positively on an action, no subsequent Advisors are consulted, and the
action is executed. When a tier-1 Advisor comments negatively on an action, that action
is eliminated from consideration, and no subsequent Advisor may support it. If the set
of possible actions is thereby reduced to a single action, that action is executed.
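The tier-1 control flow just described can be sketched as follows. This is an illustration under assumptions, not FORR's code: an Advisor is modelled as a function returning (strength, action) pairs, a strength of exactly 5 is treated as neutral, and the name `tier1_decide` is hypothetical.

```python
def tier1_decide(actions, advisors, state):
    """Return (chosen_action, remaining_actions); chosen_action is None
    when tier 1 does not settle the decision."""
    remaining = list(actions)
    for advisor in advisors:                      # consulted in their fixed order
        for strength, action in advisor(state, tuple(remaining)):
            if action not in remaining:
                continue                          # already vetoed: may not be supported
            if strength > 5:
                return action, remaining          # positive comment: execute at once
            if strength < 5:
                remaining.remove(action)          # negative comment: eliminate
        if len(remaining) == 1:
            return remaining[0], remaining        # only one action is left
    return None, remaining

veto_a = lambda state, acts: [(0, "a")]
support_b = lambda state, acts: [(10, "b")]
print(tier1_decide(["a", "b", "c"], [veto_a, support_b], None))
```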

GC’s two tier-1 Advisors are Victory and Later. If only a single vertex remains
uncolored and that vertex has been selected and has at least one legal coloring, Victory
colors it. If an iteration is for vertex selection, Later opposes coloring any vertex
whose degree is less than the number of colors that could legally be applied to it, on
the theory that consideration of such a vertex can be delayed. Typically with FORR,
the first tier does not identify an action, and control passes to tier 2 (see Figure 1).
Tier-2 Advisors plan, and may recommend sequences of actions,
instead of a single action. GC does not yet incorporate tier-2 Advisors.

Fig. 2. GC’s decision structure is a version of Figure 1. Additional tier-3 Advisors may be
added where indicated.

If neither the first nor the second tier produces a decision, control passes to tier 3
(see Figure 1). In FORR, all tier-3 Advisors are heuristic and are
consulted in parallel. A decision is reached by combining their comments in a process
called voting. When control resorts to tier 3, the action that receives the most support
during voting is executed, with ties broken at random. Originally, voting was simply a
tally of the comment strengths. Because that process makes tacit assumptions that are
not always correct, voting can also be weighted.

GC has nine tier-3 Advisors, eight of which encapsulate a single primitive, naive 
approach to selecting a vertex. Random Color is the only coloring Advisor, so GC al- 
ways selects a legal color for a selected vertex at random. Each of the remaining tier-3 
Advisors simply tries to minimize or maximize a basic vertex property. Min Degree 
supports the selection of uncolored vertices in increasing degree order with comment 
strengths from 10 down. Max Degree is its dual, rating in decreasing degree order. 
Min Domain supports the selection of uncolored vertices in increasing order of the 
number of their current legal colors, again with strengths descending from 10. Max 
Domain is its dual. Min Forward Degree supports the selection of uncolored vertices
in increasing order of their number of uncolored neighbors, with strengths from 10 down.
Max Forward Degree is its dual. Min Backward Degree supports the selection of
uncolored vertices in increasing order of their number of colored neighbors, with strengths
from 10 down. Max Backward Degree is its dual. The use of such heuristic, rather
than absolutely correct, rationales in decision making is supported by evidence that
people satisfice, that is, they make decisions that are good enough [20]. Although
satisficing solutions are not always optimal, they can achieve a high level of expertise.
See, for example, [21].

Arguably these eight properties are simply the most obvious properties one could 
ascribe to vertices during the coloring process, making it all the more remarkable that 
the experiments we carried out were able to use them to such good effect. They also 
correspond naturally to properties of the “constraint graph” and “search tree” associ- 
ated with general CSP’s, providing additional resonance to the case study. Of course, 
a skeptical reader might be concerned that, consciously or not, we have “biased” our 
set of Advisors here. Even if that were so, we would respond that it is still up to 
FORR to learn how to use the Advisors appropriately, and that the ability to incorpo- 
rate our expertise into the FORR architecture by specifying appropriate Advisors is a 
feature, not a bug. 

Although a FORR-based program begins with a set of problem-class-independent, 
tier-3 Advisors, there is no reason to believe that they are all of equal significance or 
reliability in a particular problem class. Therefore, FORR uses a weight-learning
algorithm called PWL (Probabilistic Weight Learning) to learn problem-class-specific
weights for its tier-3 Advisors. The premise behind PWL is that the past reliability of
an Advisor is predictive of its future reliability.

Initially, every Advisor has a weight of .05 and a discount factor of .1. Each time 
an Advisor comments, its discount factor is increased by .1, until, after 10 sets of 
comments, the discount factor reaches 1.0, where it remains. Early in an Advisor’s 
use, its weight is the product of its learned weight and its discount factor; after 10 sets 
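of comments, its learned weight alone is referenced. In tier 3 with PWL, a
FORR-based program chooses the action with the greatest support:

\[
\operatorname*{argmax}_{j} \Bigl( \sum_{i} \omega_i \, w_i \, s_{ij} \Bigr),
\qquad
\omega_i =
\begin{cases}
0.1 \cdot d & \text{if } d < 10 \\
1 & \text{otherwise}
\end{cases}
\]

where d is the number of opinions Advisor i has generated, ω_i is its discount factor,
w_i is its learned weight, and s_ij is the strength of its comment on action j.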



If an Advisor is correct, its wisdom will gradually be incorporated. If an Advisor is 
incorrect, its weight will diminish as its opinions are gradually introduced, so that it 
has little negative impact in a dynamic environment. 
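Weighted voting with the discount factor can be sketched as follows. The data layout is an assumption of this sketch: comments are (strength, action, advisor) triples as in the comment format above, `weights` and `opinions_so_far` are plain dictionaries, and the function returns every action tied for the greatest support so that the caller can break ties at random.

```python
from collections import defaultdict

def discount(d):
    """Discount factor after d opinions: 0.1 * d while d < 10, then 1."""
    return 0.1 * d if d < 10 else 1.0

def vote(comments, weights, opinions_so_far):
    """Sum discount * weight * strength per action and return the actions
    with the greatest total support (sorted, ties included)."""
    support = defaultdict(float)
    for strength, action, advisor in comments:
        support[action] += discount(opinions_so_far[advisor]) * weights[advisor] * strength
    best = max(support.values())
    return sorted(a for a, s in support.items() if s == best)

comments = [(10, "v11", "MinDegree"), (9, "v9", "MinDegree"),
            (10, "v9", "MaxBackward"), (9, "v11", "MaxBackward")]
experienced = {"MinDegree": 10, "MaxBackward": 10}
print(vote(comments, {"MinDegree": 1.0, "MaxBackward": 1.0}, experienced))
print(vote(comments, {"MinDegree": 2.0, "MaxBackward": 1.0}, experienced))
```

With equal weights the two vertices tie; doubling the weight of one Advisor is enough to decide the vote in favor of its preferred vertex.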

During testing, PWL drops Advisors whose weights are no better than random 
guessing. This threshold is provided by a non-voting tier-3 Advisor called Anything. 
Anything comments only for weight learning, that is, it never actually participates in a 
decision. Anything comments on one action 50% of the time, on two actions 25% of
the time, and in general on n actions with probability (0.5)^n. Each of Anything’s
comments has a randomly generated strength in {0, 1, 2, 3, 4, 6, 7, 8, 9, 10}. An
Advisor’s weight must be at least .01 greater than Anything’s weight to be consulted
during testing. During testing, provisional status is also eliminated (i.e., ω_i is set to 1), to
permit infrequently applicable but correct Advisors to comment at full strength. In
summary, PWL fits a FORR-based program to correct decisions, learning to what 
extent each of its tier-3 Advisors reflects expertise. Because problem-class-specific 
Advisors can also be acquired during learning, PWL is essential to robust perform- 
ance. 

Fig. 3. Two partially 3-colored graphs.

To get some sense of how GC behaves, consider the partially 3-colored graph in
Figure 3(a). (The graph was used in a different context in [22].) Six of the vertices are
colored, and the next vertex should now be selected for coloring. Since vertex 6 has
no legal color, however, the most recently selected vertex will be uncolored. Now
consider the partially colored graph in Figure 3(b). Since the number of possible col- 
ors for vertex 12 is 2, Later will eliminate vertex 12 as an immediate choice for col- 
oring, and the remaining uncolored vertices will be considered by tier 3. For example, 
Min Degree would support the selection of vertex 11 with a strength of 10, and the 
selection of vertices 9 and 10 with a strength of 9. Similarly, Max Backward Degree 
would support the selection of vertices 9 and 10 with a strength of 10, and vertices 6 
and 11 with a strength of 9. When the comments from all the tier-3 Advisors are tallied
without weights, vertices 6 and 11 would receive maximum support, so GC
would choose one of them at random to color. If GC were using PWL, however, the 
strengths would be multiplied by the weights learned for the Advisors before tallying 
them. 
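The tier-3 vote, with and without learned weights, can be sketched as below. The comment strengths are those quoted above for Min Degree and Max Backward Degree only, so the tallies differ from the full tier-3 result described in the text; the function names and any weights passed in are hypothetical.

```python
def discount(d):
    """Provisional discount factor: 0.1 per set of comments, then fixed at 1.0."""
    return 0.1 * d if d < 10 else 1.0


def tally(comments, weights=None, opinions=None):
    """Sum the support each action receives.

    comments: {advisor: {action: strength}}
    weights:  {advisor: learned weight}, or None for an unweighted tally
    opinions: {advisor: number of comment sets so far}, or None to skip discounting
    """
    support = {}
    for advisor, opinion in comments.items():
        w = 1.0 if weights is None else weights[advisor]
        omega = 1.0 if opinions is None else discount(opinions[advisor])
        for action, strength in opinion.items():
            support[action] = support.get(action, 0.0) + omega * w * strength
    return support


def choose(support):
    """Return the action with the greatest support (first one found on ties)."""
    return max(support, key=support.get)


# Strengths quoted in the text for the graph of Figure 3(b).
comments = {
    "Min Degree": {11: 10, 9: 9, 10: 9},
    "Max Backward Degree": {9: 10, 10: 10, 6: 9, 11: 9},
}
```

Calling `tally(comments)` gives the unweighted sums, while passing a `weights` dictionary scales each Advisor's strengths before they are added, which can change the winning vertex.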



3.2 Experimental Design and Results 

Performance in an experiment with GC was averaged over 10 runs. Each run con- 
sisted of a learning phase and a testing phase. In the learning phase, GC learned 
weights while it attempted to color each of 100 problems from the specified problem 
class. In the testing phase, weight-learning was turned off, and GC tried to color 10 
additional graphs from the same class. Multiple runs were used because GC learning 
can get stuck in a “blind alley,” where there are no successes from which to learn. 
Thus a fair evaluation averages behavior over several runs. This is actually conserva- 
tive, as we argue below that one could reasonably utilize the best result from multiple 
runs. 

Problems were generated at random, for both learning and testing. Although there
is no guarantee that any particular set of graphs was distinct, given the size of the
problem classes the probability that a testing problem was also a training problem is
extremely small. The fact that the training set varied from one run to another is, as we
shall see, an advantage.

We ran experiments on five different problem classes: 4-colorable graphs on 20 
vertices with edge densities of 10%, 20%, and 30%, and 4-colorable graphs on 50 
vertices with edge densities of 10% and 20%. (Edge densities were kept relatively low 
so that enough 4-colorable graphs could be readily produced by our CSP problem 
generator. Those classes contain 36, 53, 70, 167, and 285 undirected edges, respec- 
tively.) To speed data collection, during both learning and testing, GC was permitted 
no more than 1000 task steps for the 20-vertex graphs, and 2000 task steps for the 50- 
vertex graphs. (A task step is either the selection of a vertex, the selection of a color, 
or the retraction of a color.) 

We evaluated GC on the percentage of testing problems it was able to solve, and 
on the time it required to solve them. As a baseline, we also had GC attempt to color 
100 graphs in each problem class without weight-learning. These results appear in 
Table 1 as “no learning.” As Table 1 shows, weight learning (“yes” in Table 1) sub- 
stantially improved GC’s performance in all but the largest graphs. With weight 
learning, GC solved more problems and generally solved them faster. With weight 
learning, the program also did far less backtracking and required 32%-72% fewer 
steps per task. 

An unanticipated difficulty was that, in the 50-vertex-20%-density class, GC was
unable to solve any problem within the 2000-step limit, and therefore could not train
its weights and improve. We therefore adapted the program so that it could learn in
two other environments. With transfer learning, GC learned on small graphs but
tested on larger graphs of the same density. With bootstrap learning, GC learned first
on 50 small graphs of a given density, then learned on 50 larger graphs of the same
density, and then tested on the larger graphs. Table 1 reports the results of both bootstrap
learning and transfer learning between 20-vertex and 50-vertex classes of the
same density (e.g., from 20-vertex-20%-density to 50-vertex-20%-density).



Table 1. A comparison of GC’s performance, averaged over 10 runs. Time is in seconds per 
solved problem; retractions is number of backtracking steps per solved or unsolved problem. 



Vertices  Edges  Learning   Solutions  Time  Retractions
20        10%    no         95%        0.22  22.28
20        10%    yes        100%       0.11  0.00
20        20%    no         35%        1.11  418.16
20        20%    yes        83%        0.48  79.63
20        30%    no         12%        1.40  631.60
20        30%    yes        41%        1.43  427.05
50        10%    no         1%         3.23  815.82
50        10%    yes        46%        1.02  414.29
50        10%    transfer   32%        4.16  428.54
50        10%    bootstrap  40%        3.62  382.18
50        20%    no         0%         -     -
50        20%    yes        0%         -     -
50        20%    transfer   26%        5.09  486.61
50        20%    bootstrap  20%        4.51  519.89







3.3 Discussion 

The most interesting results from our case study are reflected in the resultant learned
weights. In the 20-vertex-10%-density experiment, where every test graph was colored
correctly, on every run only the Advisors Max Degree, Min Domain, and Min
Backward Degree had weights high enough to qualify them for use during testing.
Inspection indicated that in the remaining experiments, runs were either successful (able
to color correctly at least 5 of the 10 test graphs), or unsuccessful (able to color
correctly no more than 2 test graphs). The 8 successful runs in the 20-vertex-20%-density
experiment solved 95% of their test problems. In the 20-vertex-30%-density experiment,
the 6 successful runs solved 65% of their test problems. On the 50-vertex-10%-density
graphs, the 6 successful runs colored 76.7% of their test graphs. Inspection
indicates that a run either starts well and goes on to succeed, or goes off in a futile
direction. Rather than wait for learning to recover, multiple runs are an effective
alternative. As used here then, GC can be thought of as a restart algorithm: if one run does
not result in an effective algorithm for the problem class, another is likely to do so.

For each problem class, the Advisors on which GC relied during testing in success- 
ful runs appear in Table 2. Together with their weights, these Advisors constitute an 
algorithm for vertex selection while coloring in the problem class. Observe that dif- 
ferent classes succeed with different weights; most significantly the sparsest graphs 
prefer the opposite Backward Degree heuristic to that preferred by the others. 

The differences among ordinary GC learning on 50-vertex-10%-density graphs,
and transfer and bootstrap learning to them from 20-vertex-10%-density graphs, are
statistically significant at the 95% confidence level: ordinary learning produces the
best results, followed by bootstrap learning (where weights learned for the smaller
graphs are tuned), followed by transfer learning (where weights for the smaller graphs
are simply used). This further indicates that 20-vertex-10%-density graphs and
50-vertex-10%-density graphs lie in different classes with regard to appropriate heuristics.
Although solution of 50-vertex-20%-density graphs was only possible with
transfer or bootstrap learning, these are not our only recourses. We could also extend
the number of steps permitted during a solution attempt substantially, on the theory
that we can afford to devote extended training time to produce efficient “production”
algorithms.

In this study, we attempted to “seed” GC with an “impartial” set of alternative
vertex characteristics. Two factors previously considered individually by constraint
researchers in a general CSP context as variable ordering heuristics, minimal domain
size and maximal degree, were selected in all successful runs. Moreover, the combination
of the two is consistent with the evidence presented in [23] that minimizing
domain-size/degree is a superior CSP ordering heuristic to either minimizing domain
size or maximizing degree alone. Given the relatively recent vintage of this insight, its
“rediscovery” by FORR is impressive. Min Backward Degree corresponds to the
“minimal width” CSP variable ordering heuristic, and again FORR was arguably insightful
in weighting this so heavily for the 20-10 case, since it can guarantee a backtrack-free
search for tree-structured problems [24]. The success of Max Backward
Degree for the other classes may well reflect its correlation with both Min Domain
(the domain will be reduced for each differently colored neighbor) and Max Degree.

Table 2. Learned weights for those GC vertex-selection Advisors active during testing, averaged
across successful runs in five different experiments. 50-vertex values are from bootstrap learning.

Advisor               20-10%  20-20%  20-30%  50-10%  50-20%
Max Degree            0.678   0.678   0.743   0.547   0.678
Min Domain            0.931   0.841   0.713   0.841   0.723
Min Backward Degree   0.943   -       -       -       -
Max Backward Degree   -       0.862   0.724   0.852   0.716

In a final experiment we implemented the classic Brelaz heuristic for graph color- 
ing within FORR by simply eliminating any vertex that does not have minimum do- 
main in tier 1 and then voting for vertices with maximum forward degree in tier 3. 
Table 3 shows the results. Note that GC, learning from experience, does considerably 
better. 
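The Brelaz-style selection described above can be sketched as a two-stage filter. This is an illustrative stand-alone function, not the FORR tier-1/tier-3 encoding used in the experiment; it assumes that forward degree means the number of still-uncolored neighbors, and it breaks remaining ties arbitrarily.

```python
def brelaz_select(uncolored, domains, neighbors):
    """Pick the next vertex: minimum domain size first, then maximum forward degree.

    uncolored: set of vertices not yet colored
    domains:   {vertex: set of colors still legal for it}
    neighbors: {vertex: iterable of adjacent vertices}
    """
    # Stage 1 (tier-1 role): discard every vertex without a minimum-size domain.
    smallest = min(len(domains[v]) for v in uncolored)
    candidates = [v for v in uncolored if len(domains[v]) == smallest]

    # Stage 2 (tier-3 role): prefer the vertex with the most uncolored neighbors.
    def forward_degree(v):
        return sum(1 for u in neighbors[v] if u in uncolored)

    return max(candidates, key=forward_degree)
```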



4 Future Work 

GC is a feasibility study for our planned Adaptive Constraint Engine (ACE). ACE 
will support the automated construction of problem-class-specific constraint solvers in 
a number of different problem domains. Automating constraint solving as a learned 
collaboration among heuristics presents a number of specific opportunities and challenges.
FORR offers a concrete approach to these opportunities and challenges; in
turn, ACE provides new opportunities and challenges to extend the FORR architecture.



4.1 Opportunities 

We anticipate a range of opportunities along four dimensions: 

• Algorithms: Algorithmic devices known to the CSP community can be specified as 
individual Advisors for ACE. Advisors can represent varying degrees of local search 
or different search methods (e.g., backjumping), and they can represent heuristic de- 
vices for variable ordering or color selection. ACE could be modified to employ other 
search paradigms, including stochastic search. 

• Domains: ACE will facilitate the addition of domain-specific expertise at varying 
degrees of generality, and in various fields. For example, we might discover variable
ordering heuristics for a class of graphs or general graphs, for employee scheduling 
problems or general scheduling problems. 



Table 3. A performance comparison of GC with the Brelaz heuristic. “GC best” is the
top-performing runs with GC. Brelaz comment frequencies are provided for Min Domain (MD) and
Max Forward Degree (MFD).

                    Number of solutions     Time in seconds   Comment frequency
Vertices  Density   Brelaz  GC    GC best   Brelaz  GC        MD     MFD
20        10%       86%     100%  100%      0.33    0.11      15.99  34.01
20        20%       26%     83%   95.0%     1.19    0.48      3.77   64.41
50        10%       0%      46%   76.7%     1.71    1.23      11.27  13.11

• Change: We will begin by learning good algorithms for a static problem or problem
class, that is, good weights for a set of prespecified Advisors. In practice, however, 
problems change. For example, a product configuration problem changes when a new 
product model is introduced. ACE will offer opportunities to adapt to such change. 
Furthermore, ACE should be able to adapt to changing conditions during a single 
problem-solving episode. (The FORR architecture has proved resilient in other dy- 
namic domains.) 

• Discovery: We can select among standard techniques, for example, minimal domain 
variable ordering. We can combine these techniques, through a variety of weighting 
and voting schemes. Most exciting, we can learn new techniques, in the form of use- 
ful knowledge and new Advisors. These will include planners at the tier-2 level. Some 
preliminary work on learning new Advisors based on relative values of graph proper- 
ties (Later is an example of such an Advisor, albeit prespecified here) has shown both 
improved solution rates and considerable speedup. 



4.2 Challenges 

Exploring these opportunities will require progress in several areas. Basically, we 
need to provide the elements of a collaborative learning environment. FORR permits 
us to begin addressing this challenge quickly and concretely. 

• Advice: Many interesting issues arise in appropriately combining advice. With vari- 
able ordering heuristics, for example, we can now move beyond using secondary heu- 
ristics to break ties, or combining heuristics in crude mathematical combinations. Or- 
dering advice can be considered in a more flexible and subtle manner. The challenge 
lies in using this new power intelligently and appropriately. In particular, this may re- 
quire new voting schemes, such as partitioning FORR’s tier 3 into prioritized subsets. 
Such higher order control could be learned. 

• Reinforcement: Opportunities to learn can come from experience or from expert 
advice. ACE will provide a mechanism to generalize experience computed from ex- 
haustive analysis or random testing. It will also provide a mechanism for knowledge 
acquisition from constraint programming experts and domain experts. In particular, 
we expect that ACE will be able to extract, from domain expert decisions, knowledge 
that the experts could not impart directly in a realizable form, thereby addressing the 
knowledge acquisition problem for constraint programming. Specific reinforcement 
schemes, analysis, and experimental protocols are required to accomplish this. For 
example, what is the proper definition of an “optimal” variable ordering choice, and 
what forms of experiment or experience will come closest to modeling optimality? 

• Modeling: We need languages for expressing general constraint solving knowledge 
and domain specific expertise. Such languages will support discovery of useful 
knowledge and new Advisors. They will enable us to learn the context in which tools 
are to be brought to bear. For example, a grammar has been formulated for a language 
that compares relative values (e.g., <, =) of vertex properties (e.g., degree, number of 
colored neighbors); this grammar can be used to formulate learned Advisors. 

Modeling constraint solving and domain knowledge to facilitate discovery presents 
perhaps the most exciting combination of opportunity and challenge. The feasibility 
study already gives us a glimpse of this capability. The features we used, involving 
domain size and degree, are basic features of a constraint graph model of a problem,
and of the coloring domain in particular. We simply described possible variations on
those features, and let GC “discover” which ones most effectively contributed to control
of variable ordering during search.

More broadly, we envision modeling constraint satisfaction search as movement 
through a space of sets of potential solutions (the power set of the Cartesian product 
of the variable domains). Conventional algorithms may be viewed as special cases of 
movement through that space. Backtrack search operates by fixing one value at a time 
and moving to the Cartesian product of the remaining (pruned) domains. Hill climb- 
ing operates by moving from one singleton set to another. ACE can explore the vast 
realm of intermediate algorithms. We envision expanding FORR to operate with sets 
of possible states, an extension of the architecture that would facilitate exploration of 
algorithms modeled in this manner. 

In FORR, not all Advisors are prespecified. Given a language from which to de- 
velop them, and a learning method, a FORR-based program can acquire and integrate 
new, problem-class-specific Advisors into tier 3. For example, Hoyle, the FORR- 
based game player, has learned two different kinds of game-specific Advisors from 
perceptual data [25]. The Advisors Hoyle learns provide insight into the nature of a 
game, extend its representational ability, and substantially improve the program’s per- 
formance. The mechanism for learning new Advisors is sketched in [26] and detailed 
in [25]. 

This work will motivate numerous enhancements to FORR. For example, we hope 
to learn to sequence the tier-1 Advisors, rather than prespecify their order. We will 
also work on collaborative planning in tier 2. We intend to add some generic, re- 
source-bounded versions of forward search. We will partition tier 3 based upon 
learned weights, and then prioritize the allocation of resources accordingly. Finally, 
we expect to explore weight-learning algorithms that are more domain-specific. In 
that context, we expect to consider non-linear voting algorithms, including pairs of 
Advisors as in WINNOW [27]. 



5 Conclusions 

The combination of constraint programming and Advisor-based collaborative learning 
is an innovative approach toward making constraint software more effective and more 
widely available. Our Adaptive Constraint Engine will provide a comprehensive ar- 
chitecture for acquiring and controlling collaborative and adaptive constraint solving 
methods. 

The FORR architecture supports this frontier CSP research by transforming amor- 
phous objectives (“reinforce success”) into concrete ones (“reward Advisors”). The 
CSP research will in turn motivate major extensions of FORR facilities. A case study 
has demonstrated the potential of our project; it constructed different algorithms for 
different classes of graphs, “rediscovered” some constraint solving insights, and
outperformed the Brelaz heuristic.

Abstract. Constraint Programming (CP) is a very general programming paradigm
that has proved its efficiency in solving complex industrial problems. Most
real-life problems are stochastic in nature, which is usually taken into account
through different compromises, such as applying a deterministic algorithm to
the average values of the input, or performing multiple simulation runs. Our
goal in this paper is to analyze different techniques taken either from practical
CP applications or from stochastic optimization approaches. We propose a
benchmark drawn from our industrial experience, which may be described as an
Online Multi-choice Knapsack with Deadlines. This benchmark is used to test a
framework with four different dynamic strategies, each of which uses a different
combination of the stochastic and combinatorial aspects of the problem. To evaluate
the expected future state of the reservations at the time horizon, we use either
simulation, average values, systematic study of the most probable scenarios, or
yield management techniques.



1 Introduction 

One of Constraint Programming's (CP) major claims to success has been its application
to complex industrial problems. The strength of CP is its ability to
represent complex domain-dependent constraints, which yields interesting modeling
abilities that have been complemented, over recent years, with resolution techniques
such as meta-heuristics [CLS99]. Interestingly, most industrial combinatorial problems
are stochastic in nature, due to the uncertainty that is characteristic of real-world
situations. Our own experience with industrial problems includes construction planning,
call center scheduling, equipment inventory management, and TV advertisement booking,
to name a few. In all these situations, the resolution of a static problem is only a
compromise, since the data that is given to the algorithm only reflects the situation that
is expected in the future. As a consequence, the algorithm needs to be run often and
incrementally, to adjust to real-time events that modify the validity of the solution.
When the future is widely unpredictable, this may be the wisest thing to do, but most
often, probabilistic information is available (mean, standard deviation, and distribution
characterization) that should be used to derive more robust solutions.

T. Walsh (Ed.): CP 2001, LNCS 2239, pp. 61-76, 2001.

When confronted with this challenge, practitioners of constraint programming have
often developed and used ad hoc techniques. There is a large lore of practical advice
available in the community, but very little of it has been formalized. Three strategies
may be observed:

1. The simplest is to use a static combinatorial algorithm using the expected values 
(means) as the input. 

2. Another simple approach is to use simulation to compare possible decisions, mak- 
ing multiple runs of the algorithm using different sets of input data that are gener- 
ated following the probability information that is available. This is easy to imple- 
ment but places a heavy burden on the run time. 

3. Hybrid approaches may be developed, that try to introduce the stochastic nature 
into the general design of the algorithm. For instance, a strategy that was reported 
to the authors in the field of industrial process control is to couple a CP solver that 
solves a resource-constrained scheduling problem with a stochastic generator that 
produces “ghost” orders in the future and with a matching algorithm that links up- 
coming orders with ghosts when appropriate. Thus the scheduler runs continuously 
(and incrementally) on a flow of tasks that contains both real and predicted orders. 

The nearby field of stochastic optimization has produced over the last 30 years a
wealth of impressive results, some of which are very relevant to this issue (see [BL97]
for a survey). These results cover both the stochastic behavior of classical combinatorial
algorithms or heuristics (such as the WSPT rule from W. Smith) and the definition of
new, stochastic algorithms [HSS+96, ABC+99]. However, these results rely both on
the relative simplicity of the static algorithm that is analyzed stochastically and on many
independence hypotheses about the input data that are required to make the analysis
work.

The combination of NP-hard problems and stochastic input data has been studied 
under the field of online algorithms [FW97]. Competitive analysis has provided some 
upper bounds for the competitive ratio, which is the ratio between the worst-case be- 
havior of the online algorithm and the optimal solution given a complete knowledge of 
the input. However, there are few practical results and fewer practical good heuristics 
available for these problems. The consequence is that simulation is required for an- 
swering most of the characterization questions that we may have, as soon as the prob- 
lem becomes too difficult. For instance, finding the probability that the makespan of a 
simple scheduling problem with precedence (that would be solved statically with a 
PERT) is higher than a given value is achieved with a Monte-Carlo simulation (more 
precisely, with a pseudo-MC simulation [AW96]). The scheduling problems that we 
address in our industrial applications are NP-hard (contrary to the PERT problem) and 
the stochastic characterization of the complex branch-and-bound algorithms that are 
used to solve them is an open issue. 
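The makespan question mentioned above can be illustrated with a plain Monte-Carlo estimate on a small precedence graph. This is only a sketch: it uses ordinary random sampling rather than the pseudo-MC method of [AW96], and the task names, the sampler interface, and the requirement that tasks be listed in topological order are assumptions made for the example.

```python
import random


def makespan(durations, preds):
    """Longest-path (PERT) makespan.

    durations: {task: duration}, with tasks listed in topological order
    preds:     {task: list of predecessor tasks}
    """
    finish = {}
    for task, dur in durations.items():
        start = max((finish[p] for p in preds.get(task, [])), default=0.0)
        finish[task] = start + dur
    return max(finish.values())


def prob_makespan_exceeds(sample_durations, preds, threshold, runs, rng):
    """Estimate P(makespan > threshold) by repeated sampling of task durations."""
    hits = sum(makespan(sample_durations(rng), preds) > threshold
               for _ in range(runs))
    return hits / runs


# Hypothetical instance: task C must wait for both A and B.
preds = {"C": ["A", "B"]}


def sample(rng):
    return {"A": rng.uniform(1, 3), "B": rng.uniform(2, 4), "C": 1.0}
```

A call such as `prob_makespan_exceeds(sample, preds, 4.5, 10000, random.Random(1))` returns the fraction of sampled scenarios whose makespan exceeds 4.5.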

Our goal with this paper is to propose a first contribution that creates a bridge be- 
tween the two communities. On the one hand, we want to address combinatorial 
problems with complex constraints while taking the stochastic nature into account. On 
the other hand, we want to follow a scientific approach, using public benchmarks, and 
a survey of different techniques that come either from the field of stochastic programming
or from the experience of CP practitioners. Thus our goal is threefold:

1. Compare the relative strengths of different techniques that we have used in the past
on different problems, as well as methods to which we have been referred by
researchers from the stochastic programming community, including “Yield Management”
(YM). YM has been popularized by the airline industry and is finding its way into
many industrial domains.

2. Propose directions that may lead to better algorithms, either through the stochastic 
characterization of complex CO techniques or through the introduction of “combi- 
natorial insights” into stochastic methods. 

3. Publish a benchmark problem that is representative of our industrial problems and 
holds a strong combinatorial component, yet that is simple enough so that many 
other methods can be applied and compared. 

The problem that we propose as a benchmark comes from an online reservation
system. It has been simplified and can be described as a multi-choice knapsack
with deadlines. It represents the reservation problem faced by a tour operator that
operates a set of sites and receives reservation requests, as well as cancellations, for
groups of tourists. The goal is to maximize the final occupancy of the hotels (at a
given deadline), subject to a penalty for overbooking. The algorithm that
needs to be built receives the stream of reservation/cancellation events and is required
to accept or decline each reservation.

The paper is organized as follows. The next section gives a detailed presentation of 
our benchmark problem and an associated best-fit dynamic algorithm that will be used 
as a reference. The algorithms presented in this paper are dynamic algorithms that
compute, each time they are called, an expected valuation of the current state with
and without the incoming reservation. The difference between the various approaches
is the technique that is used to make this valuation. Section 3 presents a set of ap- 
proaches that are based on simulation, using a reasonably simple strategy to fill the 
hotels (bins). Section 4 presents on the contrary an approach that uses a more sophisti- 
cated filling algorithm but that is only applied to mean values. Section 5 presents a 
technique that is based on the characterization of the most representative scenarios of 
demand/cancellation and then solves heuristically the problem for each scenario. Sec- 
tion 6 presents a Yield Management approach, which is based on the characterization 
of the expected marginal revenue over the duration of the scenario. The last section 
gives a preliminary comparison of the results of these different approaches. 

2 Problem Description 

2.1 Mathematical Description 

Let $b_1, b_2, \ldots, b_N$ be $N$ bins of respective capacities $C_1, C_2, \ldots, C_N$, and consider $K$ types of
items of sizes $w_1, w_2, \ldots, w_K$ and values $v_1, v_2, \ldots, v_K$. The numbers of items of each type
are positive integer random variables $In_k$. The arrivals of these items are distributed
into $T$ periods indexed from 0 to $T-1$, according to a common repartition function $f(t)$.
During each period, previously arrived items have a probability of leaving.

A state of the packing is an $N \times K$ matrix $P$ of positive integers, such that $P[n,k]$ is
the number of items of type $k$ present in bin $b_n$. A strategy is a function $s(P,k,t)$
returning an integer in $[0,N]$, denoting in which bin an item of type $k$ arriving at date $t$
should be put (0 standing for refusal).

All bins are initially empty. At each date $t \in [0, T-1]$, a list of events is presented
one after another, in random order. These events can be:

• The departure of an item of type k (that was present at date t) from bin n: then 
P[n,k] is decreased by one 

• The arrival of an item of type k: then if s(P,k,t) is non null, P[s(P,k,t),k] is increased 
by one. 

The objective of the strategy is to maximize the expected value of the following
function at the end of the process:

where a is a penalty coefficient.
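One plausible reading of this objective, stated here as an assumption rather than as the paper's equation, is the total value of the items packed minus a times the excess load of each bin, which matches the overload rule of Section 2.2 (each supernumerary person costs a). The sketch below uses 0-based bin indices and a hypothetical function name; the sizes, values, capacity, and penalty in the usage note are those of the master scenario of Section 2.3.

```python
def final_value(P, sizes, values, capacities, a):
    """Value of a final packing under the assumed overload penalty.

    P[n][k]:    number of items of type k in bin n (0-based indices)
    sizes:      size w_k of each item type
    values:     value v_k of each item type
    capacities: capacity C_n of each bin
    a:          penalty per unit of load above capacity (assumed form)
    """
    total = 0
    for n, row in enumerate(P):
        load = sum(w * count for w, count in zip(sizes, row))
        revenue = sum(v * count for v, count in zip(values, row))
        total += revenue - a * max(0, load - capacities[n])
    return total
```

With sizes (17,20,25,30,33), values (13,26,21,26,39), capacity 100, and a=10, a bin holding three items of the largest type has load 99 and contributes 117; adding one item of size 20 raises the load to 119, and under this assumed penalty the contribution drops to 143 - 190 = -47.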



2.2 Corresponding Situation 

This problem could be that of a travel agency filling holiday centers with school
groups, in the presence of stochastic demand and cancellation. In this case, bins and items
would be respectively holiday centers and stays of school groups. We consider the
optimization of the filling during a specific week (say Christmas week). Stays are
booked by school academies, each of them characterized by the size of the groups it
organizes (deterministic) and the price it pays for each group (not necessarily related
to the group size). Until Christmas, these academies can book stays for this festive
week, but they cannot specify which holiday center they would like to go to. On the
other hand, when the agency agrees to book a stay, it informs the group of its destination,
and cannot modify it later. School academies can also cancel previously
booked stays (with no penalty). The goal of the agency is to maximize its final profit
under the following overload constraint: for each holiday center, each supernumerary
person must be accommodated in a nearby hotel at a fixed cost a.



2.3 Scenarios 

In all considered scenarios, bins have identical capacities (normalized to 100), the
repartition function $f(t)$ is uniform, and arrivals follow simple integer distributions
(binomial). Experiments were conducted on more than 20 scenarios, based on a
“master” scenario detailed below:

• 5 bins, 10 periods, 5 types of items.

• Sizes of items are respectively (17,20,25,30,33).

• Values are respectively (13,26,21,26,39).

• Overload penalty a=10 (see equation (1)).

• For all k: q = 0.066967 (⇒ initial leaving probability = 0.5) and $In_k$ has average 8.
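The two leaving probabilities quoted for the master scenario are consistent under one simple reading, offered here as an assumption: q is the probability that a present item leaves during any single period, independently across periods, so an item present from the first of the 10 periods leaves before the end with probability 1 - (1 - q)^10.

```python
def leave_probability_by_end(q, periods):
    """Probability that an item leaves at some point over `periods` periods,
    assuming an independent per-period leaving probability q."""
    return 1.0 - (1.0 - q) ** periods
```

Under this reading, `leave_probability_by_end(0.066967, 10)` is approximately 0.5, which matches the "initial leaving probability" given in the scenario.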

2.4 Naive Strategies and Far-Seeing Strategies

Some strategies do not take into account the stochastic aspects of the problem. For
instance, the First-Fit strategy (also named First-Come First-Serve) puts each arriving
request in the first bin that can hold it, if any; otherwise the request is refused (no
overbooking). The Best-Fit strategy is a refinement of First-Fit: the arriving item is put
in the feasible bin with the smallest current remaining size. This Best-Fit strategy (BF) will be used
in the following to normalize results on different scenarios.
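The two naive strategies can be sketched in a few lines. The function names are illustrative, bins are indexed from 0, and refusal is signalled by `None` instead of the bin number 0 used in the formal description of Section 2.1.

```python
def first_fit(free, size):
    """Return the index of the first bin with enough room, or None to refuse.

    free: remaining capacity of each bin
    size: size of the arriving item
    """
    for n, room in enumerate(free):
        if room >= size:
            return n
    return None


def best_fit(free, size):
    """Return the feasible bin with the least remaining room, or None to refuse."""
    feasible = [n for n, room in enumerate(free) if room >= size]
    if not feasible:
        return None
    return min(feasible, key=lambda n: free[n])
```

With remaining capacities `[40, 20, 25]` and an arriving item of size 20, First-Fit picks bin 0 while Best-Fit picks bin 1, which the item fills exactly.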

A far-seeing or clairvoyant strategy is a strategy that knows exactly what will happen
in the future, and therefore deals with a deterministic problem. Hence it is not a
feasible strategy but only a hypothetical one that can be useful to compute upper bounds.
In our case a far-seeing strategy knows exactly what requests will arrive, at which
dates, and when they will be cancelled. It can then extract the number of requests of
each type that will not be cancelled before the final date T. Hence the associated
deterministic problem is a multi-choice knapsack. This deterministic problem can be
optimally solved using integer programming (less than 5 min with XPRESS-MP).



3 Forward Sampling (FS) 

As explained in the introduction, our dynamic algorithms are based on state evalua- 
tion. When a request arrives, each possible decision leads to a different state. After 
evaluating all states with a certain method, the decision leading to the best state is 
chosen. 

A simple idea to evaluate a state is to generate scenarios from the current date to 
the end (date T), or more precisely from the beginning of the next period to the end of 
the final one (period T-1). Hence the evaluation of a state will be an aggregation (usu- 
ally the average) of the evaluations of “many” generated scenarios (Monte-Carlo 
simulations). 

To generate these scenarios, it is necessary to compute the conditional distribution 
of the remaining number of arrivals of each type. Several methods can be used to 
evaluate a scenario: we can simulate the behavior of a Best-Fit strategy on this sce- 
nario, or the behavior of a far-seeing strategy (exact or greedy approximation), or the 
behavior of any other “slave” strategy (for instance other strategies described in this 
paper). 
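
The evaluation loop described above can be sketched as follows. This is only an illustration under stated assumptions: the scenario generator and the slave strategy are passed in as callables, and the toy state (free capacity of a single bin), toy scenario and toy slave are hypothetical stand-ins, not the distributions or strategies used in the experiments.

```python
import random
from statistics import mean

def forward_sampling_value(state, sample_scenario, slave_value, n_samples, rng):
    """Average the slave strategy's final value over n_samples sampled futures."""
    return mean(
        slave_value(state, sample_scenario(state, rng)) for _ in range(n_samples)
    )

def choose_decision(candidate_states, sample_scenario, slave_value,
                    n_samples=100, seed=0):
    """Return (index of the best candidate state, list of estimated values)."""
    rng = random.Random(seed)
    scores = [
        forward_sampling_value(s, sample_scenario, slave_value, n_samples, rng)
        for s in candidate_states
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Toy instance (hypothetical): a state is the free capacity of one bin,
# a scenario is a list of future item sizes, and the slave is a first-fit
# filling whose value is the total size packed.
def toy_scenario(state, rng):
    return [rng.choice([3, 5]) for _ in range(3)]

def toy_slave(state, scenario):
    free, packed = state, 0
    for size in scenario:
        if size <= free:
            free -= size
            packed += size
    return packed

best, scores = choose_decision([10, 0], toy_scenario, toy_slave, n_samples=50)
```

Each candidate state corresponds to one possible decision for the arriving request; the decision whose state gets the highest average is the one selected.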

These first experiments tend to reveal that the results of this strategy do not depend
much on the quality of the slave algorithm. For instance, evaluating with a Best-Fit
or with a far-seeing algorithm leads to very similar results, although the former is a
pessimistic evaluation (the final gain is significantly higher in general) and the latter is
an optimistic evaluation (it gives an upper bound that cannot be reached). Besides,
using an evaluation with the strategy described in the next section does not improve
the results, although the behavior of this strategy is very similar to that of Forward
Sampling. Moreover, ignoring the possibility for items already present to be cancelled
caused a surprisingly small deterioration of results. Finally, since what is important is
the sensitivity of this algorithm used for comparisons, a greedy heuristic similar to C1
in Section 4 seems to be a good slave strategy.

By contrast, Table 1 shows that the number of samples has a great influence on
the average gain: even if 8 samples are sufficient to get better results than the Best-Fit
(1.67%), increasing this number up to 1000 leads to an average gain of 8.23%.

Table 1. Influence of the number of samples (1000 runs on the “master” problem)

Number of samples    8      10     20     50     100    1000
average gain (%)     1.67   2.8    ?.85   7.89   7.95   8.23



4 Using Expected Values and Combinatorial Optimization (EV) 



This second approach is similar to the previous one in its global principle, but uses a
precise resolution of a combinatorial problem that is generated from the expected
values of the departures/arrivals as an oracle of what may happen. To evaluate a current
state, we try to compute the mean of the optimal “far-seeing” (a posteriori) filling of
the bins. If we denote by C an algorithm to solve this multi-choice knapsack problem,
and by C* an optimal algorithm, we can say that we want to use E(C*) as the
expected value to guide our decision.



In the previous section we used $\frac{1}{n}\sum_{i=1}^{n} C(S_i)$ to evaluate E(C*), making use
of a simple heuristic C to fill the bins. Here we will use a better algorithm C* but we will
run it only once, over the mean values of the a posteriori problem, yielding C*(E(S)).
One can say that the goal of this paper is to evaluate different strategies to produce an
estimate of E(C*), and this section deals with the C*(E) approach, where C* is
approximated by C+.

This points to another, more complex but possibly better suited, direction for
producing online algorithms. When we evaluate the future, we use an optimal a posteriori
strategy, whereas it would be more interesting to use recursively the strategy that
is being defined. This type of recursive definition, where the optimal decision at date t
is defined using the future decisions at time t + δ, may be solved (numerically) using
differential equations when the problem is simple enough, but is, to our knowledge,
too difficult for the present problem.

However, it is clear that we cannot evaluate the expected penalty for overbooking 
using an a posteriori strategy: in the a posteriori scenario, we decide to simply ignore 
those reservations that we do not consume for our optimal filling. To take the penalty 
into account, we add a simple lower bound estimate of the penalty that is expected 
based on the current balance and the expected cancellation rate. Note that this lower 
bound is only “exact” if we were to refuse all future incoming reservations, which 
shows that there is an opportunity for a better analysis and a more precise algorithm. 

Thus, we can summarize the online algorithm as follows: 

1. Compute expected reservations and cancellations from the current date to the deadline

2. Produce average deadline scenario S and compute reference value V = C+(S)

3. For each bin b
   compute v = C+(S) + ExpectedPenalty(current state) with item added to b
   pick b that maximizes v, using balance(b) to break ties

4. If v >= V, accept item and place it into bin b, otherwise refuse item
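
The accept/refuse rule of steps 2 to 4 can be sketched as below. This is a hedged illustration, not the authors' implementation: `evaluate` stands for the combined quantity C+(S) + ExpectedPenalty and is passed in as a callable, bins are represented as lists of item sizes, and the tie-break (prefer the bin with the larger balance after insertion) is our assumption, since the direction of the tie-break is not specified. The toy evaluator with `CAPACITY` and `PENALTY` is hypothetical.

```python
def ev_decide(bins, item, evaluate):
    """Return the bin index that should receive the item, or None to refuse it."""
    reference = evaluate(bins)  # V: value of the scenario without the new item
    best_bin, best_key = None, None
    for b in range(len(bins)):
        trial = [list(content) for content in bins]
        trial[b].append(item)
        # v for this bin, with the bin balance as an (assumed) tie-breaker
        key = (evaluate(trial), sum(trial[b]))
        if best_key is None or key > best_key:
            best_bin, best_key = b, key
    if best_key is not None and best_key[0] >= reference:
        return best_bin
    return None

# Toy evaluator (hypothetical): packed volume minus a linear overload penalty.
CAPACITY = 10
PENALTY = 10

def toy_evaluate(bins):
    total = 0.0
    for content in bins:
        load = sum(content)
        total += min(load, CAPACITY) - PENALTY * max(0, load - CAPACITY)
    return total

accepted_bin = ev_decide([[6], [2]], 4, toy_evaluate)
refused = ev_decide([[6], [2]], 9, toy_evaluate)
```

In the toy run, the item of size 4 improves the evaluation in either bin and goes to the fuller one, whereas the item of size 9 lowers the evaluation wherever it is placed and is refused.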

As noticed in the previous section, the algorithm is not very sensitive to the quality 
of the cancellation prediction, although it is important to use a conservative estimate. 

The combinatorial algorithm is a constraint propagation algorithm that uses a limited
form of branching with an LDS scheme [HG95]. The branching strategy is to pick
which kind of items should be inserted into the current solution and then to branch on
all possible bins. To evaluate which bin to pick, we measure the difference in the
maximal expected value for the bin before and after the insertion. This maximal value
is the solution of the single knapsack problem for the bin, applied to all incoming
reservations. We use an LDS scheme, and decide to branch on which bin to pick when
there are two bins for which the difference between the two valuations is small
enough. The single knapsack is solved with another LDS algorithm that is very similar
but much simpler since the value of an insertion is simply the current value of the bin.

The next table addresses the importance of the quality of the C+ algorithm. We give
here the results obtained respectively with a naive greedy heuristic (C1), with a
smarter algorithm with one level of branching (C2) and last with our complete
algorithm presented here (C+). In this table, we have included the standard deviation to
substantiate our claims about the statistical relevance of the experiment, thus each cell
in the table is a tuple (average, standard deviation).



Table 2. Comparison of C1, C2 and C+

Problem   BF               C1                C2                C+
Master    454, 27.7        481, 32.9         484, 33.6         487, 34.8
Pb1       454, 29.1        489, 30.2         492, 31.6         494, 32.7
Pb2       469, 28.9        496, 34.5         499, 34.0         504, 34.8
Pb3       431, 47.4        448, 49.4         444, 34.9         449, 49.9
Pb4       82, 18.1         83, 17.6          83, 17.6          83, 17.6
Pb5       453, 29.9        481, 34.1         484, 34.9         487, 35.6
Pb6       11782, 797.6     12852, 1023.7     12935, 1024.3     12951, 1034.6
Pb7       938, 51.5        906, 53.4         922, 56.2         938, 51.4
Pb8       458, 26.5        519, 29.6         528, 30.5         538, 30.4
Pb9       421, 48.5        418, 50.4         419, 49.1         420, 48.7



A more traditional approach in the field of stochastic programming is to concentrate
on whether a new request should be accepted, using a stochastic evaluation, and
then simply use a best-fit algorithm to pick which bin must be used. This is for
instance the case for most “Yield Management” approaches, including the one that will
be presented in Section 6. Therefore, we have conducted the following experiment,
where the decision of which bin to pick is left to a best-fit heuristic, but we use the
same analysis (C+(S) with or without the new reservation) to decide whether to accept
or refuse the reservation.

One may see in Table 3 that the impact of the combinatorial choice between the bins
depends on the type of problem. This table is useful when comparing with results obtained
with our preliminary implementation of the Yield Management approach in Section 6, since
it is similar to the binary approach presented here.

Although the approach presented in this section produces satisfactory results, it still
ignores the true stochastic nature of the problem. To remedy this drawback, there are
two possible directions:



Table 3. Full exploration vs. binary strategy

Problem   Full exploration   Binary (simpler) strategy
Master    487, 34.8          481, 39.5
Pb1       494, 32.7          433, 64.1
Pb2       504, 34.8          493, 39.5
Pb3       449, 49.9          388, 60.2



1. Develop techniques borrowed from stochastic optimization to better characterize
E(C). This is the approach that we plan to investigate in the future. Because the
algorithm C+ is fairly complex and uses various heuristics, it may actually be easier
to characterize C*, that is the optimal resolution of the combinatorial problem, and
derive information that also applies to C+.

2. Characterize the stochastic nature of the problem, either with a segmentation,
which will be developed in the next section, or through cutting rules, which will be
shown at the end of Section 6.

5 Combinatorial Analysis (CA) 

5.1 Related Markov Decision Processes 

MDP is the model of choice [H60, P94] to build optimal policies for stochastic decision
problems. Under the assumptions that the arrival laws (In_j) and departure laws
are time-independent and uncorrelated, the online Multi-choice Knapsack Problem
can be modelled as a fully observable MDP, with an obviously large number of
states.

Classic techniques for solving MDPs like Linear Programming [D63], policy iteration
[H60] or value iteration [B57] are not suitable in the case of a large state space, which
calls for the development of MDP-decomposition methods. Various approaches have
been proposed for the latter, among which: adaptive aggregation [BC89], Dantzig-Wolfe
Decomposition [KC74] or, for planning problems, [BDG95, DL95, BT96].
More recently, new decomposition schemes have been developed that exploit the
“weakly coupled structure” of some MDPs that occur in online resource allocation:
policy caches [P98], dynamic MDP merging [SC98] or Markov Task Decomposition
[MHK+98].

Our approach consists in a crude implementation of an event-aggregation strategy,
focusing on regions of final states of the MDP.

5.2 Combinatorial Sampling 

A particularity of our stochastic optimization problem lies in the fact that the revenue
is evaluated at the very end of the process. In a sense, the ordering of the incoming
events has little importance, as no immediate reward or holding cost influences the
revenue, provided that the strategy is efficient. The method in this section takes
advantage of this context and is based upon the following property, which holds in the
deterministic case: at time t, if all external events (incoming items, departing items) are
known in advance, an optimal strategy consists in choosing for the item the bin that
maximizes the revenue of the Multi-choice Deterministic Knapsack Problem for
which:

• All already accepted items that will not be cancelled are forced into the solution. 

• All incoming items that will not be cancelled are possible candidates for filling the 
bins. 

Though NP-hard, the aforementioned combinatorial optimization problem can be
solved with off-the-shelf MIP tools (see [MT90] for a detailed description of knapsack-like
problems). Let F*(P[n,k],j,t) denote the optimal Deterministic Knapsack value
over all possible choices for an item of type j arriving at t. We refer the reader to
[WB92] or [PRK95] for discrete and continuous Markovian models related to online
knapsack and perishable asset revenue management problems. The solution produced
by the MIP is the best (in terms of revenue) valid “state” that is reachable from the
current state of the system by taking the corresponding acceptance and rejection decisions.

For the non-deterministic case, our evaluation strategy would ideally aim at enu- 
merating all possible event sets, compute their occurrence probability together with 
the Deterministic Knapsack revenue, hence deduce the overall expected value of each 
possible choice of bin for item j: E[F*]. 

Not surprisingly, an obvious drawback of this strategy is its computational time. In
order to estimate the revenue of each possible final state of the process that is
reachable from P[n,k], taking into account the choice at stake, one has to generate all
possible pairs (A_i, P'(n,k)) where:

• A_i is the number of items of type i arrived after t and remaining at T if accepted

• P'(n,k) is the number of remaining items of type k, already present in bin n before t
and still present in bin n at T.

with a computational cost of O(P^{N.K} . (D/K)^K), if P denotes the maximum number of
items of the same type present in the bins at t, and D the maximum number of items
that can enter the system between t and T. Thus we limited our study of final states to the
most probable states, above a given threshold, restricting ourselves to typically 10000
states in all. Finally, since evaluating each final state by computing the best MIP
solution can potentially require a couple of minutes, we use the C1 heuristic instead
(described in section 4).
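
The thresholded expectation over final states can be sketched as follows. This is a minimal sketch under our own assumptions: final states are given as (probability, event set) pairs, `det_value` stands for whichever deterministic evaluator is used (the MIP or the C1 heuristic), and the kept probabilities are renormalized after thresholding, a detail that is not stated in the text.

```python
def expected_value_per_choice(choices, weighted_events, det_value, threshold=0.0):
    """Estimate E[F*] for each candidate bin over the most probable event sets."""
    kept = [(p, e) for p, e in weighted_events if p >= threshold]
    mass = sum(p for p, _ in kept)
    if mass <= 0:
        raise ValueError("no event set is above the probability threshold")
    return {
        c: sum(p * det_value(c, e) for p, e in kept) / mass
        for c in choices
    }

# Toy data (hypothetical): two candidate bins, three event sets.
toy_table = {
    (0, "a"): 10, (0, "b"): 0, (0, "c"): 100,
    (1, "a"): 5, (1, "b"): 20, (1, "c"): 0,
}
events = [(0.6, "a"), (0.3, "b"), (0.1, "c")]
estimates = expected_value_per_choice(
    [0, 1], events, lambda c, e: toy_table[(c, e)], threshold=0.2
)
best_choice = max(estimates, key=estimates.get)
```

Dropping the unlikely event set "c" changes the ranking in this toy case, which illustrates why the threshold trades accuracy for computation time.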

The experimental results, not surprisingly, indicate that this “combinatorial” sampling
offers results that are close to the forward sampling strategy of section 3.

6 Yield Management (YM)

Yield Management techniques analytically estimate the impact of each decision on the
expected final revenue. As with the threshold principle [GR99], the expected final revenue
is computed at each new arrival, and the item is accepted if its marginal revenue improves it.

In a first step, we will describe the computation of the final total balance (FTB) and
the final total revenue (F) in a deterministic vision. In a second step, we extend these
notions to the stochastic case (probability and penalties) and deduce the stochastic
expected final revenue with penalties (φ).

Finally, we observe that YM results are better than those of the naive Best-Fit (BF)
strategy in most cases. The limitations are twofold: multi-choice blindness (the
multi-bin aspect is not involved in the formulas) and packing blindness (the major
limitation). The influence of this last combinatorial dimension is illustrated on some
special instances.



6.1 Deterministic Case 



Let us start with the simple relaxation defined by using only one bin with capacity C.N
and one demand type with size W (average of real sizes). Since in this case maximizing
the revenue means maximizing the balance, we compute the Final Total Balance (FTB),
which is the sum of all arrivals. As the total arrival exceeds C.N, the first
intuition, to remove the overloading penalty, is to adjust it with a ratio:
$FTB \cdot p_{accept}$ is equal to exactly C.N when refusing arrivals with probability
$(1 - p_{accept})$. A way to increase revenue, instead of refusing every item
with a fixed probability, consists in applying price segregation, refusing only
low price items and accepting good ones. In order to compute a threshold between low
price and good price, we sort item types (re-numbered k) by decreasing marginal
revenue ($v_k / w_k$) and incrementally compute the FTB by aggregating all accepted
items. We find k* such that

$$B_{k^*} = \sum_{i=1}^{k^*} In_i \,(1 - Out_{int,i})\, w_i$$

just exceeds C.N. For this item type, we apply a ratio. Thus, the expected revenue becomes:

$$F = \sum_{k=1}^{k^*-1} v_k\, In_k\, (1 - Out_{int,k}) + v_{k^*}\, In_{k^*}\, (1 - Out_{int,k^*})\, \frac{C.N - B_{k^*-1}}{B_{k^*} - B_{k^*-1}} \qquad (2)$$
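
The deterministic computation can be sketched as follows: item types are sorted by decreasing marginal revenue, whole types are accepted while the balance fits, and the threshold type is accepted in proportion to the remaining capacity. The function and parameter names are ours, and the numeric instance is hypothetical.

```python
def deterministic_revenue(values, sizes, arrivals, out_prob, capacity):
    """Expected revenue with price segregation in the single-bin relaxation.

    values[k], sizes[k]: value and size of type k
    arrivals[k]: expected number of arrivals of type k
    out_prob[k]: probability that an item of type k is cancelled
    capacity: total capacity C.N
    """
    order = sorted(range(len(values)),
                   key=lambda k: values[k] / sizes[k], reverse=True)
    revenue, balance = 0.0, 0.0
    for k in order:
        kept = arrivals[k] * (1.0 - out_prob[k])   # items not cancelled
        load = kept * sizes[k]                      # their contribution to the balance
        if balance + load <= capacity:
            revenue += values[k] * kept
            balance += load
        else:
            # threshold type k*: accept only the fraction that still fits
            ratio = (capacity - balance) / load
            revenue += values[k] * kept * ratio
            break
    return revenue

# Hypothetical instance: two types, capacity 20.
f_value = deterministic_revenue(
    values=[10, 4], sizes=[5, 4], arrivals=[4, 10],
    out_prob=[0.5, 0.0], capacity=20,
)
```

Here the first type (ratio 2) is fully accepted and uses half the capacity; the second type (ratio 1) is the threshold type and is accepted at 25%.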



6.2 Stochastic Case 

To take into account the distribution of possible arrivals, let us focus now on the
computation of the final revenue, which is the expected stochastic revenue (ER)
decremented by the expected stochastic penalty (EP). This stochastic revenue
φ = ER − EP is computed with the aggregated request volume x, over all the items that
will not be cancelled, whose distribution is P_in(x). The penalty term weights the
overload x.w − C.N by this distribution:

$$\sum_{x} P_{in}(x)\,[x.w - C.N] \qquad (3)$$

To understand the threshold idea, we can describe φ graphically, as shown in the next
figure, according to the level of arrival still controlled by our ratio. At the beginning
of the dashed line, the revenue for each new accepted arrival does not yield a
penalty probability. There is a linear progression of revenue depending on the amount of
arrivals, with a rate a. At a certain level C1, a small part of the arrival distribution
overloads the capacity C.N. The cost function penalty (EP) starts to reduce the amount
of additional revenue in φ. At level C2, all the additional arrivals become penalty. In
this last portion, we have a linear behavior for the revenue with rate a − α. Now, for
the full line, if we combine the price segregation with this stochastic estimation of
revenue, we apply the same strategy of incremental computation based on φ by
accepting every item until the final expected revenue decreases: this gives the desired
threshold item k* with its threshold revenue r*. As illustrated in the next figure
(where a_i is the i-th best marginal revenue), the price segregation offers a better
revenue than the one-item vision.




Fig. 1. Revenue as a function of the current balance 



As described in the introduction of this part, we dynamically recompute the r* threshold
values to decide whether to accept arrivals (hence the ratio is no longer needed).
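
The shape of the revenue curve described in this section can be reproduced with a small sketch. Everything here is an assumption made for illustration: a single item type, independent cancellations (so the number of surviving items is binomial), a revenue of `value` per surviving item, and a penalty of `alpha` per unit of overload; this is not claimed to be the exact formula (3).

```python
from math import comb

def expected_revenue(n, keep_prob, value, size, capacity, alpha):
    """phi(n) = ER - EP when n items are accepted and each survives with keep_prob."""
    phi = 0.0
    for x in range(n + 1):
        p = comb(n, x) * keep_prob ** x * (1 - keep_prob) ** (n - x)
        overload = max(0, x * size - capacity)
        phi += p * (value * x - alpha * overload)
    return phi

def acceptance_threshold(max_n, **params):
    """Number of accepted items after which the expected revenue stops improving."""
    return max(range(max_n + 1), key=lambda n: expected_revenue(n, **params))

params = dict(value=3, size=2, capacity=10, alpha=10)
no_cancel = acceptance_threshold(12, keep_prob=1.0, **params)
half_cancel = acceptance_threshold(12, keep_prob=0.5, **params)
```

Without cancellations the threshold sits exactly at the capacity (5 items of size 2); when half of the items are expected to cancel, the threshold moves beyond the capacity, which is the overbooking effect the curve is meant to capture.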



6.3 Limitations and Extensions 

The YM strategy results presented in the next section exhibit a better behavior than BF
on almost all instances. The disappointing results (compared to other approaches)
clearly illustrate the weakness of our current approach from a combinatorial
perspective (for instance, on Pb3, which packs at most 2 items per bin).

In order to illustrate this last limitation, we have extended the master characteristics
by only increasing the average number of items in a bin. Clearly the combinatorial
packing aspect (master benchmark: at most 5 items per bin) is smoothed when this
number increases.

[Figure: plot; x-axis: average number of items (0 to 40); y-axis: 0.0% to 10.0%]

Fig. 2. Influence of the average number of items per bin




Although it may appear from this preliminary experiment that the YM approach is too
difficult to tune for this type of combinatorial problem, this is not the case. For
instance, we may use the YM analysis to generate simple cut-off rules that prevent
accepting the items with the lowest value/size ratio during the first X periods. Embedding
these rules into EV (section 4) leads to a revenue 7.93% better than BF on the master
problem (compared to 7.19% without these rules).

7 Comparisons 

This paper is clearly a first step towards an ambitious goal and, therefore, suffers from
many limitations. For instance, we plan to extend the experiments to stochastic sizes of
groups (with an associated probability distribution). Last, we have limited our first
experiments to a size that is well suited to analysis, but larger experiments are also
required.

7.1 Results 

The following table compares the efficiency of our four algorithms: FS (with 1000
samples), EV, CA and YM (respectively in sections 3, 4, 5 and 6). BF denotes the
Best-Fit and XMP the optimal far-seeing method mentioned in 2.4 (⇒ competitive ratios
on the master: FS=90.6% vs. BF=83.8%).

These results are the average results on 1000 runs so that the half width of the
confidence interval at 95% is typically 0.7%. The final column gives the average gain of
the best algorithm (bold underlined results) against BF.

One may notice that run-times vary considerably from one approach to the other.
We designed these algorithms to be usable in a human-interaction context, with a
response time of around one second (thus 60s for a complete iteration is acceptable).
Obviously some online problems would require a faster approach such as YM or EV.
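
The competitive ratios quoted above follow directly from the master row of Table 4 (BF = 454, FS = 491, XMP = 542). The short sketch below reproduces the arithmetic; the function names are ours. Note that a gain recomputed from the rounded averages may differ in the last digit from the value reported in the table.

```python
def competitive_ratio(algo_revenue, clairvoyant_revenue):
    """Revenue of an online algorithm as a percentage of the far-seeing optimum."""
    return 100.0 * algo_revenue / clairvoyant_revenue

def gain_over_baseline(algo_revenue, baseline_revenue):
    """Relative improvement (in %) over the Best-Fit baseline."""
    return 100.0 * (algo_revenue - baseline_revenue) / baseline_revenue

bf, fs, xmp = 454, 491, 542  # master row of Table 4
fs_ratio = round(competitive_ratio(fs, xmp), 1)
bf_ratio = round(competitive_ratio(bf, xmp), 1)
fs_gain = gain_over_baseline(fs, bf)
```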



Table 4. Comparative results

Problem  Comment                                                 BF      FS      EV      CA      YM      best gain (%)  XMP
Master                                                           454     491     487     489     465     8.2            542
Pb1      smaller penalty                                         454     500     494     498     485     10             542
Pb2      various                                                 469     510     504     509     482     8.7            556
Pb3      bigger items                                            431     459     449     458     380     6.5            508
Pb4      single knapsack                                         82      86      83      87      60      5.8            101
Pb6      value = size^ (convex)                                  11782   13175   12951   13232   12630   12             14439
Pb7      value = 10 size^ (concave)                              910     948     938     948     901     4.2            -
Pb8      huge arrival volume: 300%                               458     533     538     513     511     17             587
Pb9      small arrival volume: 120%                              421     425     420     426     393     1.2            -
Pb10     no departures                                           469     542     533     524     514     15             -
Pb11     Small items more frequent                               442     488     483     487     465     10             -
Pb12     Big items more frequent                                 464     504     504     499     472     8.5            -
Pb17     Large standard deviation: 35%                           452     486     479     -       464     7.6            -
Pb18     Small standard deviation: 12%                           453     495     486     481     458     9.2            -
Pb19     No deviation and no departure (only the order changes)  468     542     536     548     481     17             -
CPU      Run time (s) for 1 scenario (master)                    0.01    20      0.5     60      0.02







Although this is a preliminary report, these results show the interest of mixing
insights from the stochastic and the combinatorial optimization fields. For instance, we
showed that using a better CO algorithm in the EV method pays off. On the other
hand, there is a balance to be found: using a sophisticated optimization technique is
easily wasted if the level of stochastic analysis is too low. Another illustration of our
central claim is that YM techniques used without combinatorial insights do not
seem competitive, except for “fluid” instances (cf. 6.3), whereas YM filters may
significantly improve a combinatorial method.



7.2 Robustness 

In all scenarios tested in 7.1, events are generated according to the distributions that
are known to the algorithms. By contrast, this section evaluates the robustness of
our approaches when they are based on a slightly erroneous estimation of the input
distribution (as in real life): wrong law (Table 5) or wrong average (Table 6).

Facing non-regular distributions, our strategies become very sensitive to the standard
deviation (especially CA). Estimation errors on the input average have less influence
(except for YM), but in both cases the most robust behavior consists in focusing
on average values.

Table 5. Non-regular distributions

Problem  Comment                                                        BF    FS    EV    CA    YM    best gain (%)
Pb29     non-regular distribution (avg: 8.08409, stdev: 1.87479)        454   492   484   490   463   8.2
Pb30     “Very” non-regular distribution (avg: 8.01173, stdev: 3.12526) 453   444   454   -     457   0.9



Table 6. Influence of estimation errors on the input distribution

estimation error in %     -25     -12.5   0       +12.5   +25
average gain of FS (%)    3.83    6.85    8.23    8.45    6.93
average gain of EV (%)    6.15    6.63    6.34    7.40    7.71
average gain of YM (%)    -1      -3.7    2.4     1       -1.7



8 Conclusions 

We have presented different algorithms that try to solve an online combinatorial opti- 
mization problem through different techniques that borrow from different fields. Al- 
though we report preliminary results, they are stable enough to confirm the need for 
combining insights from combinatorial and stochastic programming. We argue that 
constraint programming is a relevant technique for such problems for two reasons: 

1. On the one hand, many problems for which CP is well suited because of its expres- 
sive power (its ability to capture situations that are difficult to model) have a sto- 
chastic nature 

2. On the other hand, two of the algorithms presented here can be easily adapted to a 
CP approach for a given problem: the simulation approach (FS) and the expected 
value approach (EV) 

We have identified different topics that must be investigated before we may develop a 
generic method for solving stochastic combinatorial optimization problems with con- 
straints: 

1. How does one obtain stochastic indicators about a lower or upper bound that is
computed by a (limited) branch-and-bound algorithm?

2. How can an abstract search space be abstracted from a stochastic description, onto
which a combinatorial approach may be found?

3. How does one develop fast, incremental CP simulation engines, which is a request
that is also found when developing hybrid methods that combine CP and
meta-heuristics such as stochastic methods?

In this last question, we see that the combination of CP and stochastic has been
mostly studied so far from the angle of introducing stochastic methods (such as
randomized algorithms) as opposed to solving stochastic problems. For instance, one may
notice the difference with the proposal of Probabilistic Constraint Programming (PCP
[DW00]), where probabilities are assigned to constraints as opposed to data. However,
we plan to investigate how PCP could be used to solve this type of online problem.







Last, we want to emphasize the availability of our benchmark problem
(www.e-lab.bouygues.com/adhoc/prototypes/stokp), which we hope to see attempted
by other teams with other methods.

Abstract. In this paper, we propose a general technique for removing
symmetries in CSPs during search. The idea is to record no-goods, during
the exploration of the search tree, whose symmetric counterpart (if
any) should be removed. The no-good, called Global Cut Seed (GCS), is
used to generate Symmetry Removal Cuts (SRCs), i.e., constraints that
are dynamically generated during search and hold in the entire search
tree. The propagation of SRCs removes symmetric configurations with
respect to already visited states. We present a general, correct and
complete filtering algorithm for SRCs. The main advantages of the proposed
approach are that it is not intrusive in the problem-dependent search
strategy, treats symmetries in an additive way since GCSs are symmetry
independent, and makes it possible to write filtering algorithms which handle
families of symmetries together. Finally, we show that many relevant
previous approaches can be seen as special cases of our framework.



1 Introduction 

Constraint Satisfaction Problems (CSPs) occur widely in Artificial Intelligence
and are used for solving many real life applications. In this paper, we focus on
symmetric CSPs. A CSP is symmetric when a mapping exists that transforms a
state into another state equivalent to the first. Symmetries [12] have been identified as
a source of inefficiency since much time is spent in visiting equivalent states.

In recent years, symmetry removal methods have interested many researchers; 
three main approaches have been identified to remove symmetries. The first im- 
poses additional constraints to the model of the CSP; the work by Puget m 
defines valid reductions for the original problem obtained by imposing additional 
constraints, e.g., ordering constraints among variables, that make it possible to avoid
permutations. If a valid reduction of a given CSP is proven to be unsatisfiable,
the original CSP is unsatisfiable as well. This approach seems appealing, but 
presents two drawbacks: first, it is not always simple to find proper symmetry- 
breaking constraints in case of general symmetries; second, the search strategy 
can be influenced by the additional constraints. 

The second approach starts from a different idea: constraints able to prune 
symmetric states are introduced during search. In this setting, 0 and [2] use 

T. Walsh (Ed.): CP 2001, LNCS 2239, pp. 77-l9^ 2001. 

(c) Springer- Verlag Berlin Heidelberg 2001 
respectively the notion of conditional constraints and entailment to avoid search-
ing in branches leading to symmetric solutions. They add constraints to nodes 
of the search tree and consider the constraints valid globally in order to remove 
symmetric configurations upon backtracking. In m the authors detect and ex- 
ploit intensional permutation symmetries. Then, each time they detect a failure, 
they use this information by removing the value causing the failure from the 
domain of permutable variables. In |^, a similar pruning after failure filtering 
algorithm is proposed. Similar approaches are based on general notions of
interchangeable values [S] and syntactical symmetries [3] which partition variable
domains in equivalence classes: when a failure is detected for a given value, all 
values belonging to the same equivalence class are removed since they also lead 
to a failure. 

A third way of coping with symmetries is to define a search strategy that 
breaks symmetries as soon as possible. In [H] and uni, the authors propose a 
symmetry breaking strategy that selects first variables involved in the greatest 
number of local symmetries. The problem dependent search strategy is affected, 
but this approach can be very effective if symmetries are the prevailing feature 
of the problem. 

We focus on the second approach and propose a general framework for cop- 
ing with symmetries. We collect information during search, called Global Cut 
Seeds (GCSs), representing states whose symmetric counterpart, if any, should 
be removed. GCSs are no-goods. F. Bacchus in [T] provided a uniform view of 
backtracking algorithms based on no-goods. A no-good A is a set of assignments 
which cannot appear in any unenumerated solution. The notion of no-good de- 
pends only on the not yet explored search space. Thus, as soon as a solution is 
found, it becomes a no-good and is treated uniformly with other no-goods. 

In backtracking algorithms, no-goods are used for many purposes: to avoid
searching subtrees, to prune domain values, to backtrack and to save constraint
checks. In a symmetric problem, all these tasks can be done as well, but we
can do something more. If a no-good is detected, its symmetric counterparts
are no-goods as well. However, while in general backtracking algorithms
no-goods are forgotten on backtracking, as soon as the no-good is “broken”, in
our symmetry removal framework they are used to impose non-backtrackable
constraints, called Symmetry Removal Cuts (SRCs), that act in the unexplored
part of the search tree. Unfortunately, the number of SRCs could be exponential.
Thus, we need to limit their number. An interesting property of GCSs is that they
are symmetry independent, while SRCs depend on the symmetry. Thus, many SRCs, one for
each symmetry, can be generated independently in an additive way starting from
the same GCS or can be considered all together in a global constraint tailored for
a family of symmetries. We propose a method for inferring GCSs during search,
a general filtering algorithm for SRCs that prunes symmetric configurations, and
several specializations. We show that many relevant previous approaches can be
seen as specializations of the proposed framework.

2 Symmetric CSPs

In this section, we provide preliminary notions on symmetric CSPs. A CSP is a
triple (V, D, C) where V is a list of variables $X_1, \ldots, X_n$ ranging respectively on
finite domains $D(X_1), \ldots, D(X_n)$. A constraint $c_i(X_{i_1}, \ldots, X_{i_k})$ defines a subset
$S_i$ of the cartesian product $D(X_{i_1}) \times \ldots \times D(X_{i_k})$, containing those
configurations of assignments allowed by the constraint. An element $\tau = (v_1, \ldots, v_k)$ of
$D(X_{i_1}) \times \ldots \times D(X_{i_k})$ is called a tuple. A tuple is consistent with the constraint if it
belongs to $S_i$. The negative counterpart of consistency is the concept of no-good.
A no-good A is a set of assignments which cannot appear in any unenumerated
solution (see e.g. DP). Thus, solutions and failures are uniformly treated.

CSPs may exhibit symmetries; if two or more states are symmetric, they
represent equivalent states. Thus, only one of them should be visited, while
the others are discarded. According to [S], a symmetry $\sigma$ is a sequence of
bijective mappings $\sigma_0, \sigma_1, \ldots, \sigma_n$, where $\sigma_0 : V
\to V$ and $\sigma_i : D(X_i) \to D(\sigma_0(X_i))$, that preserves constraints.
Preserving constraints means that, by applying the symmetry to the variables
involved in the constraint, if a tuple $\tau$ is consistent with the
constraints, $\sigma(\tau)$ is also consistent. Thus, if a tuple $\tau$ is a
no-good, $\sigma(\tau)$ is also a no-good^1. Note that when $\sigma_i$ is the
identity function for every $i = 1, \ldots, n$, then $\sigma$ is a symmetry on
variables. When $\sigma_0$ is the identity function, then $\sigma$ is a symmetry
on values. In the general case considered in this paper, $\sigma$ can be a
symmetry on both variables and values.

3 Global Cut Seeds and Symmetry Removal Cuts 

The general framework proposed is based on no-goods called Global Cut Seeds 
(GCSs). Intuitively, a GCS represents a set of states whose symmetric
counterpart, if any, should be removed. Consider a CSP $P$, and suppose a
feasible solution $s$ has been found for $P$. Since we aim at removing all
solutions symmetric to $s$, $s$ should be “part” of the GCS. Similarly, given a
state $s$ infeasible for $P$, configurations of values symmetric to $s$
represent infeasible states as well. Therefore $s$ should also be “part” of a
GCS. Formally, a GCS is defined as follows:

Definition 1 (Global Cut Seed) Given a CSP $P$ with $n$ variables, a Global
Cut Seed is an n-tuple $\Delta = (\delta_1, \delta_2, \ldots, \delta_n)$, where
each $\delta_i$ is a non-empty set of values, such that each n-tuple
$(v_1, \ldots, v_n)$, $v_1 \in \delta_1, \ldots, v_n \in \delta_n$, is either an
already found solution or an infeasible state for the original problem $P$.
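
Definition 1 can be checked by brute force on small instances. The following is a minimal Python sketch, not part of the original framework: the names `expand_seed` and `is_global_cut_seed`, and the representation of a seed as a list of sets, are assumptions made for illustration only.

```python
from itertools import product

def expand_seed(seed):
    """Enumerate every n-tuple (v_1, ..., v_n) with v_i taken from seed[i]."""
    return set(product(*seed))

def is_global_cut_seed(seed, found_solutions, is_feasible):
    """Brute-force check of Definition 1 (hypothetical helper).

    Every tuple covered by the seed must be an already found solution
    or an infeasible state of the original problem.
    """
    if any(len(component) == 0 for component in seed):
        return False  # each component must be a non-empty set of values
    return all(t in found_solutions or not is_feasible(t)
               for t in expand_seed(seed))

def all_different(t):
    """Feasibility test for a problem with a single constraint of difference."""
    return len(set(t)) == len(t)
```

With three variables subject to a constraint of difference and the two solutions (1, 2, 3) and (1, 2, 4) already found, the seed ({1}, {2}, {3, 4}) passes the check, while it fails if only (1, 2, 3) has been found.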

The definition of GCS uniformly treats already found solutions and infeasible
configurations for $P$, as happens for no-goods. Indeed, the definition of GCS
is nothing more than a reformulation of the definition of no-good proposed in
[1] that is better suited to be used in the setting of symmetry removal.

^1 A symmetry $\sigma$ applied to a tuple $\tau = (v_1, \ldots, v_n)$ is defined
as follows: for each $v_i \in \tau$ its symmetrical counterpart is
$\sigma_i(v_{idx(\sigma_0(X_i))})$.

Starting from GCSs, we propose a general algorithm for pruning symmetric
configurations based on the use of Symmetry Removal Cuts (SRCs), i.e.,
constraints deduced during search that hold in the remaining part of the search
tree. Intuitively, during search, we collect GCSs (independent from the problem
symmetries), that will be used for generating SRCs. SRCs exploit the GCSs
collected at node $N$ for pruning symmetric configurations w.r.t. the one in $N$.

Definition 2 (Symmetry Removal Cut) Given a CSP on variables $V =
X_1, \ldots, X_n$, a Global Cut Seed $\Delta = (\delta_1, \ldots, \delta_n)$ and
a symmetry $\sigma$, a Symmetry Removal Cut is the constraint
remove_symmetric($\Delta$, $V$, $\sigma$) imposing that configurations symmetric
to $\Delta$ with respect to $\sigma$ are not explored in the remaining search
tree. Declaratively, remove_symmetric($\Delta$, $V$, $\sigma$) holds iff
$\exists X_i \in V$ such that $D(X_i) \not\subseteq
\sigma_i(\delta_{idx(\sigma_0(X_i))})$, where $i = idx(X_i)$ is the index of
variable $X_i$.

Note that constraints remove_symmetric($\Delta$, $V$, $\sigma$) are defined as
cuts since they are globally valid. Each time a GCS $\Delta$ is found, a symmetry
removal constraint remove_symmetric($\Delta$, $V$, $\sigma$) is imposed, removing
all configurations symmetric to $\Delta$. Hence, remove_symmetric($\Delta$, $V$,
$\sigma$) is a global non-backtrackable constraint. Similarly, in Mathematical
Programming, Branch and Cut (see e.g. HI!) algorithms deduce, during search,
linear inequalities (cutting planes) valid for the original problem.

The gap between the definitions of GCS and SRC and a practically useful
symmetry removal filtering algorithm is huge, and several problems need to be
considered. How to find GCSs? How to limit their number? How to efficiently
use GCSs for filtering? We will answer all these questions in the rest of the paper.

4 Filtering Algorithm for Pruning Symmetries 

Let $V = \{X_1, \ldots, X_n\}$ be the set of variables of a CSP $P$. A branching
strategy partitions problem $P(N)$ in a given node $N$ by imposing additional
constraints. With no loss of generality, we suppose that at each node of the
search tree the problem is partitioned in two subproblems (binary branching),
that the branching strategy imposes on the left branch a positive constraint $c$
and on the right branch a negative constraint $\neg c$, and that the left branch
is explored before the right branch^2. All constraints imposed from the root node
to a node $N$ are called branching constraints $BC$.

A node $N$ is described by a triple $(D_{old}, D_{new}, BC)$, where $D_{old}$ is
the set of domains prior to propagation, $D_{new}$ is the set of domains
(possibly) shrunk due to constraint propagation after the application of the
branching constraint, and $BC$ is the set of branching constraints imposed to
reach the node. Let $f(N)$ be the father node of $N$; $D_{old}$ of $N$
corresponds to $D_{new}$ in $f(N)$. At the root node, $D_{old}$ is the set of
initial domains. If one domain in $D_{new}$ is empty, a failure is detected and
backtracking is forced.

^2 We assume that the positive branching constraint is explored first, and we
build the search tree with the positive constraint on the left branch.

We define a general filtering algorithm for SRCs. Given a GCS $\Delta =
(\delta_1, \ldots, \delta_n)$, we have to remove from variable domains all
configurations of assignments that are symmetric w.r.t. values contained in the
GCS. The filtering algorithm is based on this simple idea. Given a GCS $\Delta =
(\delta_1, \ldots, \delta_n)$, we transform it by applying the symmetry to each
value belonging to each $\delta_i$^3. Let $i = idx(X_i)$ be the index of variable
$X_i$. At each node $N$ of the search tree, if there exists a subset $M_j = V
\setminus \{X_j\}$ of variables $V$ whose cardinality is $n - 1$ such that for
each pair $(X_i, \delta_{idx(\sigma_0(X_i))})$, $i = 1, \ldots, n$, $i \neq j$,
$D_{new}(X_i) \subseteq \sigma_i(\delta_{idx(\sigma_0(X_i))})$, we can remove from
the domain of the free variable $X_j$ all values $v_k \in
\sigma_j(\delta_{idx(\sigma_0(X_j))})$. Note that if the subset of variables for
which the condition holds has cardinality equal to $n$, the search fails and
backtracks. It is easy to see that larger $\delta_i$ in GCSs allow more powerful
filtering. The filtering algorithm has a time complexity of $O(n \cdot maxD)$,
where $maxD$ is the size of the largest initial domain.
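
The filtering step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: domains and seed components are Python sets, `sigma0` is a list giving the index of $\sigma_0(X_i)$ for each $i$, `sigma_val[i]` is the value mapping $\sigma_i$, and the function name `filter_symmetric` is hypothetical. The index convention follows the condition $D_{new}(X_i) \subseteq \sigma_i(\delta_{idx(\sigma_0(X_i))})$ given in the text.

```python
def filter_symmetric(domains, seed, sigma0, sigma_val):
    """One filtering pass for a single seed and a single symmetry (sketch).

    Returns None when all n variables are matched (failure), the pruned
    domains when exactly one variable is free, and the unchanged domains
    otherwise.
    """
    n = len(domains)
    # Symmetric image of the seed component associated with each variable.
    images = [{sigma_val[i](v) for v in seed[sigma0[i]]} for i in range(n)]
    # Variables whose current domain is not covered by the image are "free".
    free = [i for i in range(n) if not domains[i] <= images[i]]
    if not free:
        return None  # every variable matched: the node is symmetric to the seed
    if len(free) == 1:
        j = free[0]
        pruned = [set(d) for d in domains]
        pruned[j] = domains[j] - images[j]
        return pruned
    return [set(d) for d in domains]
```

An empty domain after pruning is left for the caller to detect, as with any other propagation step.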

As a simple example, consider a problem where we have three variables subject
to a constraint of difference. Their initial domain is $D(X_1) = D(X_2) =
D(X_3) = I = [1, 2, 3, 4]$ and they are subject to permutation symmetries. Now
suppose we find the first feasible solution $S_1 = \{X_1 = 1, X_2 = 2, X_3 =
3\}$, which represents a GCS by definition. Proceeding depth first, we find the
second solution $S_2 = \{X_1 = 1, X_2 = 2, X_3 = 4\}$ (see the search tree in
Figure 1). Therefore, $\Delta = (\{1\}, \{2\}, \{3, 4\})$ is a GCS. Consider the
symmetry $\sigma_0$ which maps $X_1$ to itself, $X_2$ to $X_3$ and vice versa.
Upon backtracking, we find a node where $X_1 = 1$, $X_2 = 3$ and the domain of
variable $X_3$ contains values $\{2, 4\}$. Thus, we can find a matching of size
$n - 1 = 2$ which maps $X_1$ to $\delta_1 = \{1\}$, since $D_{new}(X_1)
\subseteq \delta_1$, and $X_2$ to $\delta_3 = \{3, 4\}$, since $D_{new}(X_2)
\subseteq \delta_3$. The domain of the free variable $X_3$ (for which
$D_{new}(X_3) \not\subseteq \delta_2$) can be pruned by removing all values
belonging to $\delta_2 = \{2\}$. In this way we have removed the symmetrical
solution $S_3 = \{X_1 = 1, X_2 = 3, X_3 = 2\}$.

Proposition 1 (Correctness) The filtering algorithm is sound, i.e., it does
not remove any configuration which is not symmetric with respect to a previously
found configuration in $\Delta = (\delta_1, \ldots, \delta_n)$. Proof: see m-

Proposition 2 (Completeness) The filtering algorithm is complete, i.e., it 
removes all configurations symmetric with respect to a previously found configu- 
ration. Proof: see W- 

5 Building Global Cut Seeds during Search 

We have shown that GCSs can be used to filter values that would otherwise lead
either to symmetric solutions or to failures. We still need to show how cut seeds 
can be built during search. Failures and already found solutions are treated 
uniformly as dead-ends. In fact, no-goods refer to the unexplored part of the 
search space. 

^3 The application of a symmetry to a set $\delta_i = \{v_1, \ldots, v_m\}$ is
$\sigma_i(\delta_i) = \{\sigma_i(v_1), \ldots, \sigma_i(v_m)\}$.

At a given node $N$, infeasibility is detected when the problem $P(N)$, derived
from $P$ by the application of the sequence of branching constraints, is inconsistent.
From the definition of GCSs, we are looking for configurations of assignments 
that are inconsistent for the original problem P. Therefore, we need to find a 
way to distinguish between configurations of values that are inconsistent with P 
and those inconsistent with P{N), but consistent with P. 

This distinction can be easily done if every branching constraint c can be 
solved in one propagation step. In other words, after one run of the propagation 
algorithm associated with c, c can be safely removed from the constraint store. 
This is the case of unary branching constraints that split the domain of a single
variable (branching variable) in two. A unary constraint $c$ on variable $X_k$
can be defined by a set of values $S_k$ that are removed from the domain of
variable $X_k$ as soon as $c$ is posted. Note that under this hypothesis, at any
node $N$ of the search tree, any combination of values $(v_1, \ldots, v_n)$ with
$v_i \in D_{old}(X_i)$, $i = 1, \ldots, n$ is consistent with the set of
branching constraints $BC(N)$.

We focus now on dead-end nodes, i.e., nodes where a failure has been detected
or forced after a solution has been found. In a dead-end node $N$ only one
no-good can be built, $\Delta^{(N)} = (\delta_1^{(N)}, \ldots, \delta_n^{(N)})$.
Let $X_k$ be the branching variable and $S_k$ be the set of values removed from
$D_{old}(X_k)$ by the branching constraint at node $N$. It is easy to prove that
each combination of assignments taken from $\delta_k^{(N)} = D_{old}(X_k)
\setminus S_k$ and $\delta_i^{(N)} = D_{old}(X_i)$ for all $i = 1..n$, $i \neq k$
is a no-good, thus $\Delta^{(N)}$ is a GCS because it is inconsistent with the
original problem $P$. We will refer to these seeds as simple dead-end seeds.

In [4] this result is extended, and it is shown that no-goods can also be
generated by reasoning on values removed by constraint propagation. In
particular, at each node of the search tree, we can derive at most $n$ no-goods
(one for each variable with $D_{old} \neq D_{new}$), each involving $n$
variables. Nevertheless, in practice we only use no-goods derived at dead-end
nodes.

We can improve these seeds. As seen in Section 4, seeds composed of larger
sets $\delta_i$ lead to better pruning of the search space. If $N$ is a dead-end
node and it is reached by applying branching constraints $BC(N)$ to the original
problem $P$, it means that $P(N) = P \cup BC(N)$ is infeasible. Since branching
constraints remove from the domain of each variable $X_i$ a set of values $S_i$,
$P(N)$ is derived from $P$ by removing from the initial domain $I_i$ of each
variable $X_i$ the set $S_i$. Thus, we define a dead-end global cut seed or
simply dead-end seed:

Definition 3 (Dead-end global cut seed) Let $V = \{X_1, \ldots, X_n\}$ be the
set of variables of a CSP. Let $N$ be a dead-end node. Let $S_i$ be the set of
values removed by the application of all branching constraints of $BC(N)$ on
variable $X_i$ from the root node to the current node^4. Then, at node $N$, a
dead-end seed $\Delta^{(N)}$ can be defined as follows: $\Delta^{(N)} = ((I_1
\setminus S_1), \ldots, (I_n \setminus S_n))$ where $I_i$ is the initial domain
of each variable.

^4 If a variable $X_j$ has not been involved in any branching constraint, the
corresponding $S_j$ is empty.

Proposition 3 (A dead-end seed is a global cut seed) In a dead-end node $N$,
the dead-end seed associated to $N$ is a global cut seed if branching
constraints are unary. Proof: see li-

It is worth noting that such a dead-end seed subsumes all dead-end seeds
that could (possibly) be found in children nodes of $N$. The dead-end seeds
can be maintained in a list, and every time a new dead-end seed is generated
at node $N$, all node seeds corresponding to children nodes of $N$ can be removed
from the list.
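
The list maintenance described above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: subsumption is tested here by component-wise set inclusion rather than by tracking parent and child nodes, and the names `subsumes` and `add_seed` are hypothetical.

```python
def subsumes(a, b):
    """Seed a subsumes seed b when every component of b is contained in a."""
    return all(bi <= ai for ai, bi in zip(a, b))

def add_seed(pool, new_seed):
    """Insert new_seed into the pool, dropping every seed it subsumes.

    Skipping a new seed that is already subsumed is an extra safeguard
    added in this sketch; the text only describes removing the seeds of
    children nodes.
    """
    if any(subsumes(old, new_seed) for old in pool):
        return pool
    return [old for old in pool if not subsumes(new_seed, old)] + [new_seed]
```

Feeding the three extended dead-end seeds of Figure 1 in the order they are generated leaves a single seed in the pool at each step.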

A further improvement to these cut seeds can be made. The idea is that
whenever a right branch is selected (a negative branching constraint is imposed)
the left branch has already been explored and has led to a failure^5.

Definition 4 (Extended dead-end seed) Let $V = \{X_1, \ldots, X_n\}$ be the set
of variables of a CSP. Let $N$ be a dead-end node. Let $SP = \{SP_1, \ldots,
SP_n\}$ be the set containing, for each variable $X_i$, the set $SP_i$ of values
removed by the application of all positive branching constraints (i.e., the left
branch constraints) in $BC(N)$ on variable $X_i$ from the root node to the
current node. Then, at node $N$, an extended dead-end seed can be defined as
follows: $\Delta^{(N)} = ((I_1 \setminus SP_1), \ldots, (I_n \setminus SP_n))$
where $I_i$ is the initial domain of each variable.
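
Definitions 3 and 4 differ only in which branching constraints contribute removed values. The sketch below computes both seeds from a branching history; the history format, a list of `(variable index, removed values, is_positive)` triples, and the function names are assumptions introduced for illustration.

```python
def dead_end_seed(initial, history):
    """Definition 3: initial domains minus values removed by all branching constraints."""
    removed = [set() for _ in initial]
    for var, values, _is_positive in history:
        removed[var] |= values
    return [dom - rem for dom, rem in zip(initial, removed)]

def extended_dead_end_seed(initial, history):
    """Definition 4: only values removed by positive (left-branch) constraints count."""
    removed = [set() for _ in initial]
    for var, values, is_positive in history:
        if is_positive:
            removed[var] |= values
    return [dom - rem for dom, rem in zip(initial, removed)]
```

For node C of the example in Section 6, the history is $X_1 = 1$ (removing {2, 3, 4}), $X_2 \neq 2$ (removing {2}) and $X_2 = 3$ (removing {4}); the two functions then return the global and extended dead-end seeds listed for GCS3 in Figure 1.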



Proposition 4 (An extended dead-end seed is a global cut seed) In a 

dead end node N, the extended dead-end seed associated to N is a GCS if branch- 
ing constraints are unary. Proof: see 



Proposition 5 (Extended dead-end seeds are limited) If the tree is ex- 
plored depth first, the number of non-subsumed extended dead-end seeds is less 
or equal to the depth of the search tree. Proof: see Jl- 

Note that our approach is independent from the search strategy used. It 
is applicable on discrepancy based tree exploration methods (see e.g., Limited 
Discrepancy Search jS], Depth-bounded Discrepancy Search m) as well as Best 
First strategies. 

Although some classes of problems (for example scheduling problems) typically
do not rely on unary constraints, in most CSP applications (as well as in
Integer Linear Programming) the search strategy is defined by imposing unary
constraints on the problem variables. Note that if the hypothesis on unary
branching constraints is relaxed, the cut seeds can be built as well. However,
they become Local Cut Seeds since they are not valid for the entire search
space. Local Cut Seeds are the subject of current research.

^5 In this setting, solutions are also considered, since a failure is forced in
order to find other solutions upon backtracking.

6 Example

Now consider a simple example on how to build GCSs and on the pruning
achieved. We have three variables subject to a constraint of difference. Their
initial domain is $D(X_1) = D(X_2) = D(X_3) = I = [1, 2, 3, 4]$ and they are
subject to permutation symmetries. In Figure 1 part of the problem search space
is depicted.

We first find the first solution (node A) $X_1 = 1$, $X_2 = 2$ and $X_3 = 3$.
Then, a failure is forced. Upon backtracking we build GCS1. Different types of
GCSs can be built: the simple dead-end seed that records the solution found
(i.e., domains at node A), and the global dead-end seed that considers initial
domains $I_i$ and removes only values pruned by the unary branching constraints,
referred to in Section 5 as $S_i$. Thus, the set $\delta_2$, corresponding to
$X_2$, is equal to $I_2 \setminus S_2$. $S_2$ contains values deleted by the
branching constraint $X_2 = 2$. Note that value 1 has been deleted by problem
constraint propagation before the branching constraint on $X_2$ is posted. As a
consequence, $S_2$ contains only values 3 and 4. Thus, 1 belongs to the GCS. The
extended global cut seed in this case is equal to the global one since no
negative branching constraints have been posted.

GCS2 is generated after backtracking from node B (after the second solution
found). The simple dead-end seed corresponds to the domains of node B. The
global dead-end seed instead considers for variable $X_2$ the same set as for
GCS1. The interesting part here is $\delta_3$. In fact, since in node B variable
$X_3$ has not been involved in any branching decision, the corresponding $S_3$
is empty and $\delta_3$ is equal to the whole initial domain. Again, the extended
dead-end seed is equal to the global one since no negative branching constraints
have been posted. Having generated GCS2, we can remove (in node D) value 2 from
the domain of $X_3$. In fact, the following mapping exists: $D(X_1) \subseteq
\delta_1$, $D(X_2) \subseteq \delta_3$. Thus, we can remove from $D(X_3)$ values
belonging to $\delta_2$, i.e., values 1 and 2. In this way, we avoid computing a
symmetric solution with respect to the two solutions found.

GCS3 is generated upon backtracking from the third solution. The simple
dead-end seed corresponds again to the solution. The global dead-end seed
instead contains values from the initial domain minus those removed by the
branching constraints, i.e., $S_1 = \{2, 3, 4\}$, $S_2 = \{2, 4\}$ (again value 1
was previously pruned by constraint propagation) and $S_3$ is empty. The
interesting case
here is the extended dead-end seed, and in particular the set $\delta_2$. On
$X_2$ a negative branching constraint has been posted, but we remove only values
pruned by positive branching constraints, i.e., $SP_2 = \{4\}$, which gives
$\delta_2 = \{1, 2, 3\}$ as reported in Figure 1. The generation of GCS3 leads to
pruning the subtree corresponding to $X_2 = 4$ since it leads to symmetric
solutions.

Simple, global and extended dead-end seeds have an increasing filtering power
since they involve larger sets $\delta_i$.

7 Specialization of the Filtering Algorithm and Related 
Approaches 

We will now describe some specializations of the general filtering algorithm. 
Some of these specializations lead to new symmetry removal algorithms; some

GCS1: GENERATED UPON BACKTRACKING FROM NODE A
  Simple Dead End Seed:   $\delta_1 = \{1\}$, $\delta_2 = \{2\}$, $\delta_3 = \{3\}$
  Global Dead End Seed:   $\delta_1 = \{1\}$, $\delta_2 = \{1,2\}$, $\delta_3 = \{1,2,3\}$
  Extended Dead End Seed: $\delta_1 = \{1\}$, $\delta_2 = \{1,2\}$, $\delta_3 = \{1,2,3\}$

GCS2: GENERATED UPON BACKTRACKING FROM NODE B
  Simple Dead End Seed:   $\delta_1 = \{1\}$, $\delta_2 = \{2\}$, $\delta_3 = \{3,4\}$
  Global Dead End Seed:   $\delta_1 = \{1\}$, $\delta_2 = \{1,2\}$, $\delta_3 = \{1,2,3,4\}$
  Extended Dead End Seed: $\delta_1 = \{1\}$, $\delta_2 = \{1,2\}$, $\delta_3 = \{1,2,3,4\}$

GCS3: GENERATED UPON BACKTRACKING FROM NODE C
  Simple Dead End Seed:   $\delta_1 = \{1\}$, $\delta_2 = \{3\}$, $\delta_3 = \{4\}$
  Global Dead End Seed:   $\delta_1 = \{1\}$, $\delta_2 = \{1,3\}$, $\delta_3 = \{1,2,3,4\}$
  Extended Dead End Seed: $\delta_1 = \{1\}$, $\delta_2 = \{1,2,3\}$, $\delta_3 = \{1,2,3,4\}$

Fig. 1. Example



others instead show that already known results can be seen as special cases of 
the general algorithm given in Section 4. 

7.1 First Specialization (CUTS1)

The first specialization considers problems having a set of variables subject to 
permutation symmetries. Without loss of generality, we suppose that the entire 
set of variables is subject to permutation symmetries, and that all symmetric 
variables have the same initial domain. In this specialization, we suppose that 
the branching strategy chooses a variable and a value for the variable. In this 
case, the filtering algorithm outlined in Section 4 can be easily implemented.

Intuitively, we can see that if a value $val$ for a given variable $X_i$ is
infeasible, since all variables are symmetric, $val$ will also be infeasible for
any other variable $X_j$. Thus, upon backtracking on the branching choice $X_i =
val$, we can remove $val$ from the domain of any other variable not yet bound by
branching.
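
A minimal sketch of this pruning rule follows, assuming domains are Python sets and `assigned` holds the indices of the variables already bound by branching before the failed choice. The function name `cuts1_on_backtrack` is hypothetical. The failed branching variable itself is not in `assigned`, so `val` is also removed from its domain, which corresponds to the negative branching constraint.

```python
def cuts1_on_backtrack(domains, assigned, val):
    """After the choice X_i = val has failed, remove val from every
    variable that is not yet bound by branching (sketch for permutation
    symmetries over variables sharing the same initial domain)."""
    pruned = []
    for i, dom in enumerate(domains):
        if i in assigned:
            pruned.append(set(dom))       # bound variables are left untouched
        else:
            pruned.append(dom - {val})    # val is infeasible for all the others
    return pruned
```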

This algorithm is a special case of the general filtering algorithm proposed.
Consider a node $h$, on which a failure is generated by the assignment $X_i =
val$. Let $Selected$ be the set of the $k$ variables already assigned by
branching, and let $X_i$ be the current branching variable. The node $h$ has the
following extended dead-end seed: $\Delta^{(h)} = (\delta_1^{(h)}, \ldots,
\delta_n^{(h)})$, where for each $s \in Selected$, $\delta_s^{(h)} = \{v_s\}$;
for each $j \notin Selected$, $j \neq i$, $\delta_j^{(h)} = I \setminus
SP_j^{(h)} = I$, $I$ being the initial domain of all variables; finally
$\delta_i^{(h)} = \{val\}$.

At father node $f(h)$, $D_{new}(X_s) = \{v_s\}$ for $s \in Selected$. There exist
exactly $n - k$ different matchings $M_l$ (one for each $l \notin Selected$) of
cardinality $n - 1$ between the GCS and the domain set. Each matching $M_l$
allows us to remove $\delta_i^{(h)} = \{val\}$ from $D_{new}(X_l)$. This pruning
can be done for each $l \notin Selected$. Using different arguments, Roy and
Pachet [13] found this same algorithm for pruning in case of permutation
symmetries. Here we deduce the algorithm from the general one described in
Section 4.

An extension of this algorithm for removing (local) symmetries is that
described in [2]. It is based on the idea that if the instantiation $X_i = val$
is tried without success, we can remove $\sigma_i(val)$ from the domain of
$\sigma_0(X_i)$. In fact, given the same extended dead-end seed as before^6, we
can find in node $f(h)$ only one mapping $M_l$ of size $n - 1$, where $l =
idx(\sigma_0(X_i))$. Thus, we can remove from the domain of $\sigma_0(X_i)$ the
symmetrical of $val$.



7.2 Second Specialization (CUTS2) 

The second specialization still considers the same problems and permutation
symmetries as for CUTS1. Here we suppose that the branching constraints are
unary constraints (e.g., $X < val$, $X > val$, $X \neq val$). We are, in a way,
generalizing the results presented in [13].

Given $\Delta^{(h)} = ((I \setminus SP_1^{(h)}), \ldots, (I \setminus
SP_n^{(h)}))$ generated after a failure at node $h$, the domain of variables can
be reduced if there exists a matching of cardinality $n - 1$ between the set of
domains and $\Delta^{(h)}$. The left subtree of node $f(h)$ is the fail node
$h$. In the right subtree of node $f(h)$, for each $i \neq k$, we have
$D_{new}(X_i) \subseteq \delta_i^{(h)}$. Thus, for each $j$ such that
$D_{new}(X_k) \subseteq \delta_j^{(h)}$ a matching $M_j$ of cardinality $n - 1$
exists, and $\delta_k^{(h)}$ can be removed from $D_{new}(X_j)$.
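
The check just described can be sketched as follows. The sketch assumes that it is called in the right subtree of $f(h)$, where $D_{new}(X_i) \subseteq \delta_i^{(h)}$ already holds for every $i \neq k$; the function name `cuts2_prune` and the list-of-sets representation are illustrative assumptions.

```python
def cuts2_prune(domains, seed, k):
    """For each variable X_j (j != k) such that D_new(X_k) is contained in
    seed[j], remove seed[k] from D_new(X_j).

    Precondition (not checked here): domains[i] <= seed[i] for all i != k.
    """
    pruned = [set(d) for d in domains]
    for j in range(len(domains)):
        if j != k and domains[k] <= seed[j]:
            pruned[j] -= seed[k]
    return pruned
```

On the example of Section 6, with the seed ({1}, {1, 2}, {1, 2, 3, 4}) of GCS2, branching variable $X_2$ and current domains ({1}, {3, 4}, {2, 4}), the sketch removes value 2 from the domain of $X_3$.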

Whenever a new GCS A^^') is inserted into the pool, the filtering algorithm 
checks for each j if Dnew{Xk) C At most sizeOf{I) values are removed 
from Dnew{Xj). Therefore, the initial propagation of each GGS can be obtained 
in 0(n sizeOf(I)^). For each variable there are at most sizeOf{I) cuts to be 
considered, if all the instantiations of different values to a variable provide a 
non-subsumed GGS. Whenever a set of values is removed from a domain of a 
variable, at most n checks should be performed for each cut. Each time a check 
Dnew{Xk) C 5^^'^ is successful, at most sizeOf(I) values are removed from 
Dnew(Xj). Therefore, the worst case complexity of the filtering algorithm on a 
single cut for the entire set of permutation symmetries is 0{n sizeO f {I))\ 
computational results show that the mean case complexity is lower. 

7.3 Third Specialization (CUTS3) 

We now give a specialization of the algorithm for symmetries defined by
interchangeable values [5]. Let us consider a CSP with $n$ variables having
interchangeable values. Given $n$ variables $X_1, \ldots, X_n$, with initial
domains $I_1, \ldots, I_n$, suppose each $I_i$ can be partitioned in $k_i$
subsets of values $S_i^1, \ldots, S_i^{k_i}$ such that for any couple of values
belonging to the same partition, $v_a \in S_i^j$, $v_b \in S_i^j$, any two
solutions $(v_1, \ldots, v_{i-1}, v_a, v_{i+1}, \ldots, v_n)$ and $(v_1, \ldots,
v_{i-1}, v_b, v_{i+1}, \ldots, v_n)$ are equivalent.

GCSs can be built at each failure, leading to $\Delta^{(h)} = ((I_1 \setminus
SP_1^{(h)}), \ldots, (I_n \setminus SP_n^{(h)}))$ at node $h$. $SP_i^{(h)}$ is the
set of values removed from variable $X_i$ by positive branching constraints. Let
$X_k$ be the branching variable at node $f(h)$. In the right subtree defined by
node $f(h)$, for each $i$, $i \neq k$, $D_{new}(X_i) \subseteq \delta_i^{(h)}$.
Thus, a matching of cardinality $n - 1$ exists with $\delta_k^{(h)}$ and
$D_{new}(X_k)$ unmatched. Therefore, for each value $v^* \in \delta_k^{(h)}$ all
values $v$ such that $v$ and $v^*$ belong to the same partition of the domain can
be removed from $D_{new}(X_k)$.

^6 The dead-end seed is in fact independent from the symmetry considered.
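
The pruning for interchangeable values can be illustrated with a short sketch. It assumes the partition of the domain of $X_k$ is given as a list of disjoint sets; the function name `cuts3_prune` and the parameter names are hypothetical.

```python
def cuts3_prune(domain_k, seed_k, partition):
    """Remove from D_new(X_k) every value lying in the same block of the
    partition as some value of the seed component delta_k (sketch)."""
    to_remove = set()
    for block in partition:
        if block & seed_k:        # the block contains a value of the seed
            to_remove |= block    # so all its interchangeable values go
    return domain_k - to_remove
```

For instance, if values 1 and 2 are interchangeable and the choice of value 1 has failed, value 2 is removed as well.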

In [3], the concept of interchangeable values is extended to cycles of
symmetries where values are symmetrical two by two. Therefore, if a value
participates in no solution, all symmetrical values can be removed as well. Our
approach applies also in this case, by defining cycle symmetries instead of
simple interchangeable values.

7.4 Fourth Specialization (CUTS4)

One of the first general filtering algorithms for removing symmetries was given
in [7]. The authors propose an algorithm, called SBDS, that, under the
hypothesis that branching decisions are of type $Var = val$ as opposed to $Var
\neq val$, is able to remove a given symmetry $\sigma$. We show that SBDS can be
reinterpreted as a specialization of our filtering algorithm.

Let $A$ be the set of already assigned variables at node $N$ ($X_i = v_i \;
\forall X_i \in A$), and $\{\sigma_0, \sigma_1, \ldots, \sigma_n\}$ be the
symmetry handled by SBDS. If $X_k$ is the current branching variable at node
$N$, the left branch imposes the constraint $X_k = val$. In [7], the authors show
that the following constraint can be added in the right branch in order to
remove symmetric solutions w.r.t. $\{\sigma_0, \sigma_1, \ldots, \sigma_n\}$:

$(\forall X_i \in A:\; X_i = v_i,\; \sigma_0(X_i) = \sigma_i(\{v_i\})) \wedge X_k \neq val \Rightarrow \sigma_0(X_k) \neq \sigma_k(\{val\})$

This constraint is a special case of the filtering algorithm proposed in Section
4. To prove that, let $I_i$, $i = 1, \ldots, n$ be the initial domain for
variables $X_i$ involved in the symmetry $\sigma$. By definition of symmetry,
$I_i = \sigma_i(I_{idx(\sigma_0(X_i))})$, $i = 1, \ldots, n$.

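Let us now consider the GCS generated by the failure and backtracking from
the left branch of node $N$: $\Delta = (\delta_1, \ldots, \delta_n)$ where
$\delta_i = \{v_i\} \; \forall i \,|\, X_i \in A$, $\delta_k = \{val\}$,
$\delta_j = (I_j \setminus SP_j) \; \forall j \,|\, X_j \notin A, j \neq k$,
where $I_j$ is the initial domain of variable $X_j$. Positive branching
constraints have been applied only to variables in $A$. Thus $SP_j = \emptyset
\; \forall j \,|\, X_j \notin A$. Then, $\delta_j = I_j \; \forall j \,|\, X_j
\notin A, j \neq k$.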

We show that when the left hand side of the implication is true, then the
propagation algorithm in Section 4 imposes the right hand side to be true as
well. Let us consider the left hand side $(X_i = v_i, \sigma_0(X_i) =
\sigma_i(\{v_i\}) \; \forall X_i \in A) \wedge X_k \neq val$ to be true. Then the
following matching of cardinality $n - 1$ exists: $(D_{new}(X_l), \delta_i) \;
\forall l \,|\, X_l = \sigma_0(X_i)$ with $X_i \in A$; in fact $D_{new}(X_l) =
\sigma_i(\{v_i\})$ and $\delta_i = \{v_i\}$.

All $\delta_j \,|\, X_j \notin A, j \neq k$ contain the whole initial domain of
the corresponding variable ($\delta_j = I_j \; \forall j \,|\, X_j \notin A, j
\neq k$), and therefore they also find a match.

Finally, $\delta_k = \{val\}$ and $D_{new}(X_h) \,|\, \sigma_0(X_k) = X_h$ are
unmatched. Therefore, the algorithm removes $\sigma_h(\delta_k)$ from $X_h$. This
corresponds exactly to the pruning performed by SBDS, i.e., $\sigma(X_k \neq
val)$. In the same way, it is easy to see that when the right hand side of the
implication is false, then the left hand side must be false as well or a failure
will be triggered.

A similar way of coping with symmetries is that of [2], which applies for any
search strategy. When the branching strategy considers unary constraints on
branching variables, our approach can again be seen as a generalization of that
of [2]. However, when the hypothesis does not apply, their approach is more
general, provided that they are able to define the symmetrical counterpart for a
general branching constraint. We are currently studying the extension of our
approach to the case of non-unary branching constraints, leading to the
generation of Local Cut Seeds instead of Global ones.



7.5 Removing Families of Symmetries 

A very common characteristic of symmetric combinatorial problems is that they 
present a very large (often exponential) number of symmetries. Consider, for
example, a problem with permutation symmetries. Since there are $n!$ different
permutations (i.e., $n!$ different symmetries), it is clearly impractical to
remove all symmetries of the problem one by one. In the case of an exponential
number of symmetries, in [7] the authors propose to remove only subsets of
symmetries using conditional constraints. With our framework, we can either
consider a subset of
symmetries or a family of symmetries at a time. We believe that the separation 
between GCSs, and SRCs provides the basis for more sophisticated reasoning on 
families of symmetries. We can easily develop (as shown in Section FTTIl a spe- 
cific cut generator for removing permutation symmetries, and the corresponding 
SRC. Using this family of symmetry removal cut we are able to efficiently remove 
all permutation symmetries together. We are currently developing other specific 
cuts for removing the value-permutation family of symmetry and rotation fam- 
ily of symmetry. Similarly to the introduction of global constraints in CP, we 
believe that the development of symmetry dependent cuts dedicated to fami- 
lies of symmetries could greatly increase the performance of CP in symmetric 
combinatorial problems. 



8 Using the Framework 

In this section, we will try to close the gap between the theoretical study of Sec- 
tion [3] and SI and the practical use of our framework. We have implemented the 
framework by using ILOG solver. Symmetric values are pruned through SRCs 
which exploit the information contained in particular no-goods called GCSs. 
Global Cut Seeds (extended dead-end seeds) are created during tree search by 
a cut seed manager. The cut seed manager contains the set of problem vari- 
ables and is informed of the application of the positive and negative branching 
constraints. Positive branching constraints are stored, while negative branching 
constraints trigger the generation of a new global cut seed. 

A symmetry $\sigma$ can be defined by a function returning, for each
variable-value pair $(X_i, v_h)$, a pair $(X_j, v_k)$ such that $X_j =
\sigma_0(X_i)$ and $v_k = \sigma_i(v_h)$. We built a cut generator responsible
for generating a symmetry removal cut for a general symmetry $\sigma$ for each
GCS in the cut seed manager.

Once the cut seed manager and the cut generator have been provided, any 
type of symmetry can be easily handled by simply defining the function describ- 
ing the symmetry itself, and passing it to the cut generator. 
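
As an illustration of describing a symmetry by a function on variable-value pairs, the sketch below defines two mirror symmetries of the n-Queen problem, assuming a model in which variable $X_r$ holds the column of the queen in row $r$ (0-indexed). This is not the ILOG Solver implementation; the function names and the model are assumptions made for the example.

```python
def value_mirror(n):
    """sigma_0 is the identity; every sigma_i maps column c to n - 1 - c."""
    return lambda var, val: (var, n - 1 - val)

def variable_mirror(n):
    """sigma_0 maps row r to n - 1 - r; every sigma_i is the identity."""
    return lambda var, val: (n - 1 - var, val)

def apply_symmetry(sigma, assignment):
    """Apply a symmetry, given as a function on (variable, value) pairs,
    to a complete assignment stored as a list indexed by variable."""
    image = [None] * len(assignment)
    for var, val in enumerate(assignment):
        sym_var, sym_val = sigma(var, val)
        image[sym_var] = sym_val
    return image

def is_queens_solution(cols):
    """True when no two queens share a column or a diagonal."""
    n = len(cols)
    return (len(set(cols)) == n
            and len({r + c for r, c in enumerate(cols)}) == n
            and len({r - c for r, c in enumerate(cols)}) == n)
```

Applying either function to a 4-Queen solution yields another solution, which is the constraint-preserving property required of a symmetry.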






9 Computational Results 

Although the main focus of this paper is to provide a general framework for
removing symmetries, we show some computational results on well known symmetric
CSPs. The approach was tested on four problems: the n-Queen problem, the
pigeonhole problem, the Ramsey problem, and the golfer problem. All tests were
run on a Pentium III 600 MHz.

The n-Queen problem has 7 rotational and mirror symmetries corresponding
to the symmetries of a chessboard. The all-solutions n-Queen problem has been
previously solved by [7] using the SBDS method. The use of the extended dead-end
global cut seeds makes it possible to detect symmetric configurations earlier
than SBDS. In fact, on this problem, we reduce the number of fails obtained in
[7]. Nevertheless, the conditional constraints used in SBDS are “lighter” than
our symmetry removal cuts, and therefore the run time of our approach is higher
than the one presented in [7]. The results for the 10 and 12 queens problems are
shown in Table 1 with and without symmetry removal cuts. On this problem, the
only advantage of the cut generation technique w.r.t. SBDS lies in the fact that
we are not forced to use a labeling strategy.



Table 1. Experimental Results on n-Queen problem

Problem     |        sym remov        |        symmetric
            | Nb Sol | Time | Fail    | Nb Sol | Time | Fail
n-Queen 10  | 92     | 0.52 | 868     | 724    | 0.71 | 5451
n-Queen 12  | 1787   | 12   | 17427   | 14200  | 15   | 107797


The pigeonhole problem is a classical benchmark for methods aiming at
removing symmetries. This problem has an exponential number of symmetries,
namely all possible permutations of the variables. We built a symmetry removal
cut able to handle the whole family of permutation symmetries together. We
tested the approach on problems of size 9 and 10, using a domain splitting
branching strategy and a labeling branching strategy. We compare our approach
with the one proposed in m, where symmetry breaking constraints (ordering
constraints added to the model) were used. The results show that our approach
is competitive with it. The results for the 12 pigeons problem are reported in
Table 2, where the first line uses a labeling strategy, while the second one
uses a domain splitting branching strategy.



Table 2. Experimental Results on the 12 pigeon problem

            |  sym rem cut   | sym break cst
Strategy    | Time     Fail  | Time     Fail
------------+----------------+----------------
labeling    | 0.1      1024  | 0.07     1280
dom split   | 0.7      7888  | 0.5     13531

The Ramsey problem is another well-known symmetric problem. For its
definition and CP model we refer to [12]. For N < 17 the problem is feasible
and contains a large number of symmetrical solutions; for N = 17 the problem is
infeasible. We have used the same model proposed by Puget in m and previously
studied in [7].

The golfer problem was proposed by W. Harvey and can be found in the
CSPLib. The problem is stated as follows: G golfers want to play in Gr groups
of W weeks each, in such a way that any two golfers play in the same group at
most once. How many weeks can they do this for? An instance is described as
golfer(Gr, G, W).

To the best of our knowledge, all methods used to solve these problems
PI add symmetry breaking constraints to the model in addition to other methods
(like SBDS). Although our objective is to remove all symmetries using symmetry
removal cuts, we also combine symmetry breaking constraints with symmetry
removal cuts. On both problems we used GCSs to remove symmetries of the
variable permutation type and symmetry breaking constraints to remove other
types of symmetries. The results for the Ramsey problem are comparable with
the ones in PI, and the results for the golfer problem improve on the ones
published in m. However, the improvement w.r.t. m is mainly due to a different
model of the problem. The important point here is the evidence that the
generation of GCSs and the pruning by global cuts do not penalize the run time
obtained with symmetry breaking constraints.

Table 3 shows the results for three Ramsey problems (R6-all, R7-all and
R17), and for two golfer problems. Problems R6-all and R7-all find all
solutions for problems with 6 and 7 nodes. R17 is infeasible and the results
refer to the proof of infeasibility. In the Ramsey problems only node
permutation symmetries were removed using our method, while color symmetries
were handled as in [12]; combined symmetries on nodes and colors are not
handled. Note that using a global symmetry removal constraint (removing all
node permutations) we are able to reduce the number of symmetric solutions
found w.r.t. the approach proposed in m, and in [2]. Results on golfer(4,3,4)
refer to the run time and number of fails to find all solutions. Results for
the infeasible problem golfer(4,3,5) refer to the run time and number of fails
to prove infeasibility. The task of removing all symmetries in these two
problems is the subject of current work.

Table 3. Experimental Results on the Ramsey problem and Golfer problem

                |      sym rem cut       |     sym break cst
Problem         |   Sol   Time     Fail  |    Sol   Time    Fail
----------------+------------------------+------------------------
R6-all          |  1161   0.43      864  |   7697   0.9       27
R7-all          | 20054   8.1     14367  | 252325   29.5     135
R17             |     0   0.4       181  |      0   0.2      636
golfer(4,3,4)   |    16   2.9      1235  |     16   1.6     1235
golfer(4,3,5)   |     0   3.0      1078  |      0   1.7     1473

Note: golfer(4,3,4) has a large number of symmetric solutions, while
golfer(4,3,5) has no solutions.

10 Conclusions and Future Work

In this paper we propose a method to collect information during search and a
general filtering algorithm able to use the collected information to prune the
search space in symmetric CSPs. The method proposed does not interfere with
problem dependent heuristics, and can be easily implemented in Constraint
Programming when commonly satisfied hypotheses on branching constraints are
respected. The general filtering algorithm can, in practice, be very easily
specialized to obtain constraints able to remove a given set of symmetries,
and we show that many relevant previous approaches can be reinterpreted as
specializations of the filtering algorithm proposed. Finally, the separation
between the symmetry independent GCS data structure and the filtering
algorithm allows families of symmetries to be treated together. We are
currently extending this framework to the case of non-unary branching
constraints, developing symmetry removal cuts for other families of symmetries
(permutations of values, rotations, etc.), and applying global cut seed ideas
to “quasi-symmetric” problems such as job-shop scheduling problems.
Abstract. Symmetries in constraint satisfaction or combinatorial
optimization problems can cause considerable difficulties for exact solvers.
One way to overcome the problem is to employ sophisticated models
with no or at least fewer symmetries. However, this often requires a lot of
experience from the user who is carrying out the modeling. Moreover,
some problems even contain inherent symmetries that cannot be broken
by remodeling. We present an approach that detects symmetric choice
points during the search. It enables the user to find solutions for complex
problems with minimal effort spent on modeling.

Keywords: symmetry breaking during search, graph partitioning,
n-queens problem, golfer problem



1 Introduction 

Symmetries can give rise to severe problems for solution algorithms, as
equivalent search regions are unnecessarily explored more than once. There are
several ways of handling symmetries. One is to model the problem in such a way
that no or at least fewer symmetries remain. This may also imply adding
constraints which will only be satisfied by one assignment in each equivalence
class. The major disadvantage of this approach is that it requires the user to
have a certain level of experience, and sometimes it is not even possible to
remove symmetries from a problem formulation as they are inherent to the given
problem. Another way to break symmetries is to add constraints during the
search for a solution. Those constraints can, e.g., be derived from functions
mapping single assignments to their symmetric versions.

We refer to all methods that avoid the exploration of symmetric parts of
the search space as symmetry breaking strategies. However, there is of course a
difference between approaches adding constraints to the model either statically
or dynamically, and pruning/propagation approaches like the one we describe in
this paper. Whenever a complete search is performed, all those methods have
the same effect in that they do not expand symmetric choice points. Note that
for incomplete searches, the approach presented here may be used as well, but
requires caution with respect to the handling of previously visited search
nodes (see Section ).

* This work was partly supported by the German Science Foundation (DFG) project
SFB-376, and by the IST Programme of the EU under contract number
IST-1999-14186 (ALCOM-FT).



1.1 State of the Art 

Whereas model reformulations have been used successfully for quite a few spe- 
cific problems in combinatorial optimization or constraint satisfaction, only very 
little research has been carried out on the topic of breaking symmetries system- 
atically. In [7], Rothberg presents ways to remove symmetries from mixed integer 
problems (MIPs) by using cuts. Sherali and J.C. Smith discuss the effectiveness 
of adding constraints to a basic model in a number of case studies [ID]. In m, 
Gent and B. Smith develop a generic approach called SBDS. In every choice 
point, SBDS may extend the model dynamically by adding symmetry breaking 
constraints. For a combinatorial design problem, this approach has been shown 
to be efficient in combination with refined problem formulations which are used 
to remove symmetries already in the model HD. As the number of symmetries 
in the given problem is enormous, the approach presented is not able to detect 
all of them and thus also gives non-unique solutions. In [6], Meseguer and Tor- 
res introduce a symmetry avoiding approach that works by adapting the search 
strategy. 

We introduce a method that detects symmetric choice points within the 
search procedure. Every time the search algorithm generates a new choice point, 
we check if it is equivalent to or dominated by a node that has been expanded 
earlier. If so, the current choice point can be pruned. If not, it is processed nor- 
mally. By checking whether a value assignment to a variable yields a symmetric 
search node, we can also use symmetries to shrink the domains of variables. How- 
ever, that propagation can be very costly and thus is not suited in all cases. As 
the method is based on the detection of dominance relations between subtrees, 
we call it Symmetry Breaking via Dominance Detection (SBDD). 

The remaining part of the paper is structured as follows: In Section 2 we
formally introduce the SBDD approach. In Sections 3, 4 and 5 it is applied
to three different examples from combinatorial optimization and combinatorial
design. Numerical results are given that substantiate the effectiveness of the
approach.

2 Breaking Symmetries 

The goal of breaking symmetries is to avoid the exploration of a search space △
that can be mapped into a previously considered part □ via a symmetry function:
if □ does not contain any solution, neither does △, and otherwise all
solutions in △ are symmetric to those already computed during the investigation
of □. Thus, symmetries can be used to prune the search tree, and also to remove
values from variable domains that would lead the search to a symmetric part
of the search space.

Before we outline the concept more formally, first we introduce some helpful 
definitions. 

Definition 1. Let $X = \{x_1, \ldots, x_n\}$ denote the set of variables of the
model to solve, and let $D(x)$ denote the domain of variable $x \in X$.

The tuple $P^c = (D^c(x_1), \ldots, D^c(x_n))$ denotes the current state in
choice point $c$. We refer to this representation as a pattern.

Definition 2. Let $P^c = (D^c(x_1), \ldots, D^c(x_n))$ and
$P^{c'} = (D^{c'}(x_1), \ldots, D^{c'}(x_n))$.

- We say that $P^{c'}$ includes $P^c$ ($P^c \subseteq P^{c'}$), iff
  $\forall x \in X : D^c(x) \subseteq D^{c'}(x)$.

- We set $M_V := D(x_1) \times \cdots \times D(x_n)$.

- Given a symmetry mapping function $\varphi : M_V \to M_V$, we say that
  $P^{c'}$ dominates $P^c$ (under the symmetry $\varphi$), iff
  $\varphi(P^c) \subseteq P^{c'}$. Then, we write $P^c \preceq P^{c'}$.

Property 1. Given two choice points $c$ and $c'$, where $c'$ is a successor of
$c$ in the search tree. Then it holds: $P^{c'} \preceq P^c$.

To ease the presentation, in the following we assume that the partitioning of 
the search space is achieved by using unary branching constraints. However, the 
concept can be generalized by adding information on the branching constraints 
active in a search node to the definition of a pattern that is used to reflect 
the current situation in the search. Then, the definitions of symmetry mapping 
functions etc. have to be adapted accordingly. 

The approach we suggest for the pruning of symmetric parts of the search 
space is based on the following ingredients: 

- A database T that stores information on the search space already explored.

- A problem specific function $\Phi : (P^\triangle, P^\Box) \to \{false, true\}$
  that yields true iff the pattern $P^\triangle$ is dominated by $P^\Box$ under
  some symmetry function $\varphi$.

- If symmetries shall also be used for propagation, a similar function is
  needed that, for all variables x, removes all values b from the domain of x
  for which $\Phi(P^\triangle[x = b], P^\Box) = true$.

In every choice point, we check whether the current pattern is dominated
by some pattern in T. If so, the current node is pruned. Otherwise, we
can use the function $\Phi$ for propagation. Thus, we perform Symmetry Breaking
via Dominance Detection (SBDD). Figure 1 visualizes the general procedure.
White nodes are still active, black nodes have been fully expanded already.
Boxes represent patterns in T, circles are patterns not or no longer contained
in T. Finally, △ marks the current node. Originally, a pattern △ must be
checked against all fully expanded nodes (see Figure 1(a)).

Obviously, it is problematic if we are to store all expanded nodes in T. In 
the next Section, we describe how to handle T efficiently for depth first search 
(DFS). Then, we generalize the result to arbitrary search strategies. 
Fig. 1. The concept of SBDD



2.1 Efficient Realization in a Depth First Search 

The key for an efficient realization of the general SBDD concept as described 
above is the observation that, within a DFS, we do not need to keep the in- 
formation of all previously expanded nodes in the search tree. Instead, we can 
merge sibling entries in T on backtracking, thus summarizing and compressing 
the information gathered. 

Lemma 1. Let c be a choice point with state
$P^c = (D^c(x_1), \ldots, D^c(x_n))$, where i is the index of the branching
variable in c, and $D^c(x_i) = \{v_1, \ldots, v_l\} \subseteq D(x_i)$. Further,
denote with $P^{c_k} = (D^{c_k}(x_1), \ldots, \{v_k\}, \ldots, D^{c_k}(x_n))$,
$\forall\, 1 \le k \le l$, the states of the children $c_1, \ldots, c_l$ of c.
Finally, let $P^{c'}$ be the state in choice point c' with
$P^{c'} \preceq P^{c_k}$ for some $1 \le k \le l$. Then, it holds
$P^{c'} \preceq P^c$.

Proof. For all $x \in X$ it holds that $D^{c_k}(x) \subseteq D^c(x)$. Thus,
$\varphi(P^{c'}) \subseteq P^{c_k} \subseteq P^c$.

Using Lemma 1, SBDD in combination with DFS can now be realized efficiently:
We start with $T = \emptyset$ and process each choice point as follows:

1. Check the pattern $P^c$ of the current choice point c against all patterns
   in T. If $\exists P \in T : \Phi(P^c, P)$ then fail. (Alternatively,
   encapsulate this function in a constraint and use it for propagation as
   well.)

2. (normal processing within the choice point)

3. on backtracking: if there are more siblings to be expanded, then add the
   current pattern to T, else delete all patterns of the other siblings from T.



To summarize, when using DFS the current pattern only needs to be compared
with patterns left-adjacent to the path from the root to △ (see Figure 1(b)).
Notice that step 2 refers to the normal processing of a choice point that also
takes place when no additional symmetry breaking framework is utilized,
including the choice of a branching variable and the exploration of the
children.

The efficiency of the approach mainly depends on the number of patterns 
that have to be checked. The number of patterns is at most as large as the depth 
of the search tree times the cardinality of the largest domain. 
2.2 Arbitrary Search Strategies

Referring to the discussion on the size of T, it seems to be impractical to
combine SBDD with search strategies other than DFS, because the number of
previously expanded nodes, and thus the size of T, may be enormous. Or, the
method becomes ineffective, because many nodes are closed late, as is the case
for breadth first search, for instance.

Nevertheless, with a slight modification, it is possible to cope with general
search strategies. Let c be the current choice point, and $P^c$ the
corresponding pattern. The idea now is to check whether a symmetry function
maps $P^c$ to a pattern of a choice point c' that would have been processed
before c if DFS had been applied on a static variable ordering (see
Figure 1(c)). If so, c is rejected, otherwise we proceed normally. In this way,
we prune the tree because we detect that the work has either been carried out
already or because we decide to do it later. Notice that the current path in
the search tree contains all information necessary to identify the patterns
that are relevant for checking. The assumption of a static variable ordering
defines an ordering of all choice points. The approach rejects the current
choice point iff a dominating pattern exists left of it in a DFS tree, i.e. iff
the current choice point is greater than one that already has been or will be
explored later. As an exhaustive search eventually will consider the leftmost
nodes as well, we can be sure not to miss a solution.

Notice, that the search strategy is slightly affected by this procedure, because 
the exploration of choice points can be postponed by the symmetry breaking 
algorithm. However, one might expect a reasonable search strategy to rate sym- 
metric parts of the search tree as equally important. In that case, the expanding 
of the current choice point is only postponed formally, but in fact is carried out 
next in a symmetric version. 

After having outlined the general approach, in the following Sections we apply 
it to three different applications in the field of combinatorial optimization and 
constraint satisfaction. 

3 Graph Partitioning 

The first application of the method described in Section 2 is the graph
bipartitioning problem. Given an undirected graph $G = (V, E)$, the graph
bipartitioning problem asks for a set $V' \subset V$ such that the numbers of
nodes in $V'$ and $V \setminus V'$ differ by at most one, and the number of
edges between both sets is minimal. This optimal number is often referred to as
the bisection width of the graph. Graph bipartitioning is known to be NP-hard;
exact solutions can only be computed for small graphs, i.e. $|V| < 200$.
Interestingly, graph bipartitioning alone already induces a symmetry, as the
sets $V'$ and $V \setminus V'$ can be exchanged.

An obvious symmetry breaking strategy in this case is the assignment of
node 0 to set $V'$. Unfortunately, if graph G itself introduces symmetries,
such an assignment does not break the resulting combined symmetries.
In parallel computing, connection networks are typically nicely structured
and their symmetries are known. Graphs of the hypercube family have been
studied intensively (see | ll5|). One popular network is the so-called de
Bruijn network, which is defined as follows:

Definition 3 (de Bruijn Network DB(k)). The de Bruijn Network of dimension k
is a directed graph $DB(k) = (V_k, E_k)$. The edge set can be described best by
associating the nodes with their corresponding binary representation, i.e.
$V_k = \{(b_0 \ldots b_{k-1}) \in \{0,1\}^k\}$. Then,

$E_k = \{(b\alpha, \alpha b), (b\alpha, \alpha\bar{b}) \mid \alpha \in \{0,1\}^{k-1}, b \in \{0,1\}\}$

where $\bar{b}$ denotes inverting bit b, i.e. $\bar{b} = 1 - b$.





Fig. 2. de Bruijn networks of dimension 3 (left) and 4 (right). A node is marked by 
the binary string corresponding to its number. The dashed lines mark the symmetries 
of the de Bruijn network. 



DB(k) contains $2^k$ nodes, each having degree 4, and $2^{k+1}$ edges.
Furthermore, DB(k) contains 3 symmetries described by the following
automorphisms:

$\sigma_1 : V \to V,\ (b_0, b_1, \ldots, b_{k-1}) \mapsto (b_{k-1}, b_{k-2}, \ldots, b_0)$
$\sigma_2 : V \to V,\ (b_0, b_1, \ldots, b_{k-1}) \mapsto (\bar{b}_0, \bar{b}_1, \ldots, \bar{b}_{k-1})$
$\sigma_3 : V \to V,\ (b_0, b_1, \ldots, b_{k-1}) \mapsto (\bar{b}_{k-1}, \bar{b}_{k-2}, \ldots, \bar{b}_0)$

Symmetries $\sigma_1$, $\sigma_2$ and $\sigma_3$ are visualized in Figure 2,
where DB(3) and DB(4) are shown.

In the following, for the graph partitioning problem we will interpret any
directed arc of DB(k) as an undirected edge.
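
As a small sanity check of the definition, the sketch below builds the undirected edge set of DB(k) and verifies that bit reversal, bit complement and their combination map edges to edges. It assumes the reading of the automorphisms given above; the function names are chosen here for illustration. Because edges are stored as sets, parallel arcs collapse and self-loops become one-element sets, so the size of the returned set is not the arc count of the directed graph.

```python
from itertools import product


def de_bruijn_edges(k):
    """Undirected edge set of DB(k); nodes are bit tuples (b_0, ..., b_{k-1})."""
    edges = set()
    for node in product((0, 1), repeat=k):
        b, alpha = node[0], node[1:]
        for last in (b, 1 - b):
            # arcs (b alpha -> alpha b) and (b alpha -> alpha not-b), direction dropped
            edges.add(frozenset((node, alpha + (last,))))
    return edges


def reverse(node):
    return node[::-1]


def complement(node):
    return tuple(1 - b for b in node)


def reverse_complement(node):
    return complement(reverse(node))


def is_automorphism(sigma, edges):
    """True iff applying sigma to both ends of every edge reproduces the edge set."""
    return {frozenset(sigma(v) for v in e) for e in edges} == edges
```

For example, `is_automorphism(reverse, de_bruijn_edges(3))` evaluates to True, and the same holds for the other two maps.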

3.1 Bisection Width of the de Bruijn Graph 

It can be shown that the bisection width of DB(k) is $\Theta(2^k/k)$, but there
are only few results for concrete graphs. In [3], an optimal bisection width of
30 for DB(7) has been computed. At the time that paper was written, the
algorithm based on LP bounds ran for about two weeks. To our knowledge, no
exact bisection widths for bigger de Bruijn networks were known at that time.

Recently, Sensen improved the well-known bound based on clique embeddings
(equivalent to 1-1 multi-commodity flows) by introducing variable
multi-commodity flows. Using interior point methods for the resulting linear
programs, he was able to prove an exact bisection width of 54 for DB(8). The
symmetry detection routine described in Section 2 was used to avoid the
consideration of symmetric parts of the search space. We refer to [S] for
details on the overall approach. Here, we concentrate on the symmetry breaking.
We use this example to show an easy application of SBDD rather than to
underline its efficiency. For comparisons with SBDS we refer to Sections 4
and 5.

3.2 Symmetry Breaking for Graph Partitioning 

When bipartitioning de Bruijn networks, seven symmetries have to be encoded in
$\Phi$. They stem from the three automorphisms of the network itself, the
exchange of $V'$ against $V \setminus V'$, and the combination of these
symmetries.

For the graph bipartitioning problem, a pattern is implemented as an n-tuple
$P \in \{0, 1, *\}^n$. $P_i = 0$ ($P_i = 1$) means that node $i \in V'$
($i \in V \setminus V'$). $P_i = *$ means that node i has not been assigned
yet. The symmetry functions $\varphi_1, \ldots, \varphi_7$ permute the nodes
according to $\sigma_1$, $\sigma_2$ or $\sigma_3$ and/or invert the entries. A
pattern $P^\triangle$ is dominated by $P^\Box$ iff there is a symmetry function
$\varphi_k$, $1 \le k \le 7$, such that for all $0 \le i < n$ it holds
$\varphi_k(P^\Box)_i = *$ or $P^\triangle_i = \varphi_k(P^\Box)_i$.

It is also possible to use pattern information for propagation. Assume that
there is a symmetry function $\varphi_k$ and an index j, $0 \le j < n$, such
that $\varphi_k(P^\Box)_i = *$ or $P^\triangle_i = \varphi_k(P^\Box)_i$ for all
$0 \le i < n$, $i \ne j$, and $P^\triangle_j = *$. Let
$\varphi_k(P^\Box)_j = 0$ (or $\varphi_k(P^\Box)_j = 1$). Then we can force
that node j is in $V \setminus V'$ (or $V'$, respectively).
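
The dominance test and the propagation rule for bipartition patterns can be sketched as follows. This is a minimal illustration under assumptions made here: patterns are strings over '0', '1' and '*', node i is identified with the k-bit binary representation of i (first bit most significant), and the identity is included among the symmetry functions so that a stored pattern also dominates its own specializations. The helper names are hypothetical.

```python
from itertools import product


def node_maps(k):
    """Identity and the three automorphisms of DB(k) as permutations of node indices."""
    def index(bits):
        return int("".join(map(str, bits)), 2)

    maps = []
    for rev, comp in product((False, True), repeat=2):
        perm = []
        for bits in product((0, 1), repeat=k):   # enumerates nodes 0, 1, ..., 2^k - 1
            image = bits[::-1] if rev else bits
            image = tuple(1 - b for b in image) if comp else image
            perm.append(index(image))
        maps.append(perm)
    return maps


def images(pattern, k):
    """Images of a pattern under all node maps, each with and without exchanging sides."""
    swap = {"0": "1", "1": "0", "*": "*"}
    out = []
    for perm in node_maps(k):
        moved = [None] * len(pattern)
        for i, p in enumerate(pattern):
            moved[perm[i]] = p
        out.append("".join(moved))
        out.append("".join(swap[p] for p in moved))
    return out


def dominated(current, stored, k):
    """True iff some image of the stored pattern includes the current pattern."""
    return any(
        all(s == "*" or c == s for c, s in zip(current, img))
        for img in images(stored, k)
    )


def forced(current, stored, k):
    """Pairs (node, value) implied by propagation against one stored pattern."""
    result = set()
    for img in images(stored, k):
        diff = [j for j, (c, s) in enumerate(zip(current, img)) if s != "*" and c != s]
        if len(diff) == 1 and current[diff[0]] == "*":
            j = diff[0]
            # Taking the image's value at j would be dominated, so take the other one.
            result.add((j, "1" if img[j] == "0" else "0"))
    return result
```

With k = 2 and the stored pattern "0***", the pattern "***0" is dominated, since complementing all bits maps node 00 to node 11, whereas "*0**" is not.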




Fig. 3. The search tree for DB(8) bipartitioning when breaking all possible symme- 
tries. Notice, that chains of choice points with only one successor result from detecting 
symmetric parts that are not explored. 



Fig. 4. The search tree for DB(8) bipartitioning without breaking any symmetries.

Figures 3 and 4 show the different branching trees resulting from a
computation of DB(8) with and without breaking symmetries. As expected, the
search tree is much smaller in the first case. Notice that huge parts of the
solution space are cut off by lower bound information. Thus, many symmetric
subtrees are pruned early, thereby diminishing the effect of symmetry breaking.
However, since in this approach the effort per choice point is very high due to
expensive bound computations (≈ 14 minutes per choice point), any reduction of
the tree size reduces the overall CPU time consumption significantly. Thus, for
the computation of the bisection width of DB(8), the breaking of symmetries
reduced the running time by roughly 2 days, leaving an overall computation time
of 37.5 hours.

4 The Golfer Problem 

We also applied SBDD to find solutions for the Golfer Problem (Problem 10 in
CSPLib 12), which is:

32 golfers want to play in 8 groups of 4 each week, in such a way that any
two golfers play in the same group at most once. How many weeks can
they do this for?¹

This problem can be generalized by parameterizing it to w weeks and g groups
of s players each, written as g-s-w from now on. In case of
$(s - 1)w = gs - 1$, we achieve a specification where every player must play
with every other exactly once. This problem is also known as the Schoolgirl
Problem (see Section 4.3).

4.1 Symmetries in the Golfer Problem 

Obviously, there is a lot of symmetry in the problem. First, players can be
placed at any position within a group ($\varphi_P$), groups can be exchanged
within their week ($\varphi_G$), and also the weeks can be ordered arbitrarily
($\varphi_W$). Furthermore, the players can be permuted ($\varphi_X$).

Following the idea that symmetry detection should also work well in
combination with simple models, we have chosen a straightforward one that can
be implemented with little effort using the ILOG Solver environment. The groups
are modeled as sets of players with the cardinality of each set fixed to s.
Each week contains g such sets, and the full pattern covers w weeks. To shrink
the search space, we fix all players in the first week in increasing order.
Additionally, we insert the first s players into the first s groups for all
weeks thereafter. Finally, the first group of the second week is filled with
the smallest players possible. All these assignments can be made without
increasing the complexity of the model or losing unique solutions.

¹ In the original problem it is clear that the golfers cannot play for more
than 10 weeks. On the other hand, a solution for 5 weeks can be found easily
without backtracking by always choosing the first possible player for a group
in each week.

4.2 Breaking Symmetries 

By using set variables for each group, the model does not contain symmetry
$\varphi_P$ anymore. To detect the domination of patterns with respect to the
other symmetries, we describe three symmetry detection functions $\Phi_G$,
$\Phi_{W,G}$ and $\Phi_{W,G,X}$ that are used during the search. Function
$\Phi_{W,G}$ includes the checks performed by $\Phi_G$, and $\Phi_{W,G,X}$
includes those done by $\Phi_{W,G}$.

$\Phi_G$  Given two week indices $1 \le i, j \le w$, $\Phi_G$ is used to check
if week i of pattern $P^\Box$ dominates week j of pattern $P^\triangle$ with
respect to symmetry $\varphi_G$. This is done by checking whether all groups of
week i of pattern $P^\Box$ can be mapped to groups in week j of pattern
$P^\triangle$. In the example shown in Figure 5, week 2 of pattern $P^\Box$
cannot be mapped to week 1 of pattern $P^\triangle$, because players 1 and 2
are in the same group in pattern $P^\triangle$ but are in different groups in
pattern $P^\Box$. A similar reasoning for players 2 and 3 prevents mapping
week 3 of pattern $P^\Box$ to week 2 of pattern $P^\triangle$. However, week 2
of pattern $P^\Box$ can be mapped to week 2 of pattern $P^\triangle$, as the
latter is just a specialization of the former.

$\Phi_{W,G}$  To break symmetries $\varphi_W$ and $\varphi_G$, function
$\Phi_{W,G}$ constructs a bipartite graph G containing a node for each week of
$P^\Box$ and $P^\triangle$. An edge is inserted iff a week of $P^\Box$
dominates a week of $P^\triangle$, which is determined using $\Phi_G$. If G
contains a matching of cardinality w, $P^\Box$ dominates $P^\triangle$. Again,
Figure 5 shows an example.

$\Phi_{W,G,X}$  Incorporating also the last symmetry $\varphi_X$ results in a
huge computational effort, as $\Phi_{W,G}$ has to be applied for $(g \cdot s)!$
different permutations. To reduce the cost of this check, we use the fact that
the first week of a pattern is always complete due to the fixed entries. Since
it has to be matched to some other week, “only” possibilities are left.
However, the test remains expensive. Therefore, we tried some variations
reducing the frequency with which $\Phi_{W,G,X}$ is applied. A parameter q can
be set to restrict full symmetry checks to every q-th level of the search tree.
Optionally, it can be limited to be performed on full patterns, i.e. leaves,
only, which is the default.
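
The checks $\Phi_G$ and $\Phi_{W,G}$ described above can be sketched with two nested bipartite matchings. The code below is a simplified illustration, not the ILOG Solver implementation: it assumes that a pattern is a list of weeks, a week is a list of groups, and a group is the frozenset of players already fixed in it, so the possible (not yet decided) elements of the set variables are ignored. Player permutations ($\varphi_X$) are not handled. All names are hypothetical.

```python
def max_matching(adj, n_right):
    """Size of a maximum bipartite matching; adj[u] lists the right nodes adjacent to u."""
    match = [None] * n_right

    def augment(u, seen):
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                if match[v] is None or augment(match[v], seen):
                    match[v] = u
                    return True
        return False

    return sum(augment(u, set()) for u in range(len(adj)))


def week_dominates(stored_week, current_week):
    """Phi_G-style check: every stored group maps into a distinct current group."""
    adj = [[j for j, cur in enumerate(current_week) if grp <= cur] for grp in stored_week]
    return max_matching(adj, len(current_week)) == len(stored_week)


def pattern_dominates(stored, current):
    """Phi_{W,G}-style check: match every stored week to a distinct week it dominates."""
    adj = [[j for j, cur in enumerate(current) if week_dominates(wk, cur)] for wk in stored]
    return max_matching(adj, len(current)) == len(stored)


f = frozenset
stored = [[f({1, 2}), f({3, 4})], [f({1, 3}), f()]]          # second week partly open
current = [[f({1, 3}), f({2, 4})], [f({3, 4}), f({1, 2})]]   # weeks and groups reordered
other = [[f({1, 3}), f({2, 4})], [f({1, 4}), f({2, 3})]]
```

Here `pattern_dominates(stored, current)` holds, because the first stored week maps to the second current week and vice versa, while `pattern_dominates(stored, other)` does not, since no week of `other` contains players 1 and 2 in one group.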

4.3 Numerical Results 

The model described has been implemented in ILOG Solver 5.0 and run for
different configurations on a Sun Enterprise 450 (400 MHz UltraSparc-II).
Tables 1 and 2 show the results of the experiments. Apart from the time (in
seconds) needed to find the first solution ($t_1$) and the time to find all
solutions ($t_{all}$), the number of calls to the symmetry detection functions
$\Phi_{W,G}$ and $\Phi_{W,G,X}$ is given. In the sym-section, $\Phi_{W,G}$ is
applied to check for symmetries $\varphi_W$ and $\varphi_G$ in each node of the
search tree. Since symmetries $\varphi_X$ are not detected, many non-unique
solutions are found. In the nosym-section, $\Phi_{W,G}$ is also applied in
every node of the search tree, and additionally $\Phi_{W,G,X}$ is applied in
leaves, preventing symmetric solutions from being written out. The tables
continue with the number of detected symmetries (symmetries), the number of
choice points (cp), and the number of fails. Since we are using a very simple
model for the problem, an approach that does not prevent the exploration of
symmetric parts of the search tree is not applicable in practice, as shown in
m. Therefore, a comparison with such an approach is left out here.

Fig. 5. The left hand side shows two patterns $P^\triangle$ and $P^\Box$. Each
pattern consists of three weeks (horizontal) of three groups of three players.
Unbound variables are left empty. On the right hand side, the corresponding
bipartite graph is shown, containing a node for each week of both patterns.
Since a matching of cardinality 3 exists (bold edges), $P^\triangle$ is
dominated by $P^\Box$.

Since invoking the symmetry detection function $\Phi_{W,G,X}$ is
computationally very expensive, applying it in every search node does not
improve the overall runtime, although the number of choice points is reduced.
Thus, there is a trade-off between the reduction of choice points and the
effort spent on the detection of symmetries. We have tested a scheme that
applies $\Phi_{W,G,X}$ not only in leaves but also performs additional checks
for all symmetries in every node in every q-th level of the search tree.
Table 3 shows that invoking $\Phi_{W,G,X}$ too often increases the overall
runtime, but applying it too rarely (e.g., only in leaves) is not the best
choice, either. For the 4-4-4 problem, an invocation in about every 8-th level
has proven to be the best. Similar observations have been made for other
instances as well. Table 4 shows the improved running times for the 4-4-X
problem.
Table 1. Results of the golfer 4-3-X problem.

problem  solutions   t_1    t_all   Φ_{W,G}  Φ_{W,G,X}  symmetries      cp    fails
sym
4-3-2           48  0.00     0.03       226          0           0     195      148
4-3-3         2688  0.02     6.09     99454          0           0   28299    25612
4-3-4         1968  0.05    26.70    382120          0        2808   94845    92878
4-3-5            0  0.00    36.34    412456          0        3120  100389   200390
nosym
4-3-2            1  0.00     0.04       226         47          47     195      194
4-3-3            4  0.01    10.00     99454       2687        2684   28299    28296
4-3-4            3  0.04    29.18    382120       1967        4773   94845    94843
4-3-5            0  0.00    36.28    412456          0        3120  100389   200390


Table 2. Results of the golfer 4-4-X problem.

problem  solutions   t_1    t_all   Φ_{W,G}  Φ_{W,G,X}  symmetries      cp    fails
sym
4-4-2          216  0.00     0.09       735          0           0     555      340
4-4-3         5184  0.01     8.71     74175          0           0   43755    38572
4-4-4         1296  0.01    20.53    140595          0        1296   82635    81340
4-4-5          432  0.01    25.90    132531          0        2160   75723    75292
4-4-6            0  0.00    30.76    114027          0           0   72267    72268
nosym
4-4-2            1  0.01     0.17       735        215         215     555      555
4-4-3            2  0.01   136.31     74175       5183        5182   43755    43754
4-4-4            1  0.01    22.09    140595       1295        2591   82635    82634
4-4-5            1  0.02    26.51    132531        431        2591   75723    75723
4-4-6            0  0.00    30.71    114027          0           0   72267    72268


Table 3. Results of the golfer 4-4-4 problem performing additional checks for
symmetry φ_X in search tree nodes of every q-th depth.

level of Φ_{W,G,X}  solutions   t_1    t_all   Φ_{W,G}  Φ_{W,G,X}  symmetries     cp   fails
nosym
1                           1  0.01   698.51         0         26          18     82      82
2                           1  0.02   271.35        29         27          24    123     123
4                           1  0.02   101.26       156         79          79    339     339
8                           1  0.01    14.51      5292       1296        1296   4730    4730
leaves                      1  0.01    22.09    140595       1295        2591  82635   82634
Table 4. Improved results of the golfer 4-4-X performing additional checks for
symmetry φ_X in search tree nodes of every 8-th depth.

problem  solutions   t_1    t_all   Φ_{W,G}  Φ_{W,G,X}  symmetries     cp   fails
nosym, level of Φ_{W,G,X} = 8
4-4-2            1  0.00     0.17       735        215         215    555     555
4-4-3            2  0.01   134.10      5283       1298        1297   6492    2891
4-4-4            1  0.01    14.51      5292       1296        1296   4730    4730
4-4-5            1  0.02    15.68      5291       1295        1296   4722    4722
4-4-6            0  0.00    17.16      5290       1295        1295   4714    4715


SBDS versus SBDD. In m, an SBDS approach is developed for the golfer 
problem. As has been mentioned before, to break symmetries SBDS inserts ad- 
ditional constraints to the model during the search, and hands them over to the 
solver. Even in combination with complex models, due to the large amount of 
symmetries in the golfer problem, the approach presented is not able to add all 
constraints necessary to break all symmetries. However, SBDS makes it possible to
reduce the number of search nodes significantly.

When using SBDD for the golfer problem, it is possible to find unique solu- 
tions only, even in combination with a very simple model. Obviously, the perfor- 
mance of the approach presented here can be further improved by using more 
sophisticated problem formulations. However, the focus in this paper was not to 
develop a most efficient approach for the golfer problem, but to present a method 
for symmetry breaking that can be used efficiently also by inexperienced users 
and in combination with simple models. 

We are currently working on an approach that combines SBDD with a refined
model for the golfer problem and is able to solve the so-called schoolgirl problem.

In 1850, Thomas Kirkman stated the following problem, which in fact is 
equivalent to the golfer 5-3-7 problem: 

How can 15 schoolgirls walk in 5 rows of 3 each for 7 days so that no girl 

walks with any other girl in the same triplet more than once? 

Preliminary experimentation shows that this approach is able to compute all 
7 unique solutions to the schoolgirl problem in less than 2 hours. 
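
The constraint behind both the golfer and the schoolgirl problem can be stated as a small validity check. The following is a minimal sketch, not the model used in the experiments; the function name `is_valid_golfer_schedule` and the list-of-weeks representation are assumptions made for illustration. A g-s-w instance has g groups of s players each, over w weeks.

```python
from itertools import combinations

def is_valid_golfer_schedule(schedule, groups, size):
    """Check a schedule for the golfer g-s-w problem.

    schedule is a list of weeks; each week is a list of `groups` groups,
    each group holding `size` distinct players numbered 0 .. groups*size - 1.
    Returns True iff every player plays exactly once per week and no pair
    of players shares a group more than once over all weeks.
    """
    players = list(range(groups * size))
    met = set()  # pairs that have already shared a group
    for week in schedule:
        if len(week) != groups or any(len(g) != size for g in week):
            return False
        if sorted(p for g in week for p in g) != players:
            return False
        for g in week:
            for pair in combinations(sorted(g), 2):
                if pair in met:
                    return False
                met.add(pair)
    return True
```

With this representation, a solution of the schoolgirl problem would be a schedule accepted by `is_valid_golfer_schedule(schedule, 5, 3)` with 7 weeks.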

5 The n-Queens Problem 

Finally, we consider the classical n-queens problem. It consists of placing n
queens on an n × n chessboard such that no two queens can capture each other.
That is, no two queens are allowed to be placed on the same row, the same
column, or the same diagonal.

Nowadays it is possible to find one solution for 1000-queens using CP in a few
seconds. Asking for all non-symmetric solutions of n-queens requires some more
effort. In the following, we describe the SBDS approach of Gent and Smith [4]
on the n-queens problem and compare it to SBDD.

5.1 Breaking Symmetries in n-Queens

It is easy to see that the n-queens problem incorporates seven symmetries,
namely reflections in the horizontal and vertical axes, reflections in the two
main diagonals, and rotations through 90°, 180°, and 270°.
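
To make the seven symmetries concrete, the sketch below applies them (plus the identity) to complete placements and counts solutions up to symmetry by brute force. It is only an illustration under our own assumptions: the names `images`, `solutions` and `unique_solutions` are hypothetical, and plain enumeration of permutations stands in for the SBDS and SBDD searches discussed here.

```python
from itertools import permutations

def images(p):
    """Return the 8 images (identity + 7 symmetries) of a full placement p,
    where p[r] is the column of the queen in row r."""
    n = len(p)
    maps = [
        lambda r, c: (r, c),                  # identity
        lambda r, c: (n - 1 - r, c),          # reflection in the horizontal axis
        lambda r, c: (r, n - 1 - c),          # reflection in the vertical axis
        lambda r, c: (c, r),                  # reflection in the main diagonal
        lambda r, c: (n - 1 - c, n - 1 - r),  # reflection in the other diagonal
        lambda r, c: (c, n - 1 - r),          # rotation by a quarter turn
        lambda r, c: (n - 1 - r, n - 1 - c),  # rotation by a half turn
        lambda r, c: (n - 1 - c, r),          # rotation by three quarter turns
    ]
    result = []
    for f in maps:
        q = [None] * n
        for r, c in enumerate(p):
            r2, c2 = f(r, c)
            q[r2] = c2
        result.append(tuple(q))
    return result

def solutions(n):
    """All placements with one queen per row and column and no shared diagonal."""
    return [p for p in permutations(range(n))
            if len({p[i] + i for i in range(n)}) == n
            and len({p[i] - i for i in range(n)}) == n]

def unique_solutions(n):
    """One canonical representative (the smallest image) per symmetry class."""
    return {min(images(p)) for p in solutions(n)}
```

Taking the lexicographically smallest image as representative is one simple choice of canonical form; any other fixed rule would give the same number of classes.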



SBDS. In [4], SBDS is introduced first and tested on a variety of problems. The
approach is general and compatible with different search strategies. A user of the
concept only needs to provide symmetry functions mapping a single assignment
to its symmetric version.

In a choice point where we assign x = v on the left and x ≠ v on the right
branch, SBDS adds all constraints that are necessary to prevent the solver from
exploring a subtree symmetric to an already investigated one. By keeping track
of all already broken symmetries, only necessary constraints are posted, thus
keeping the overhead small.

SBDD. For the n-queens problem, a pattern p is an n-tuple where p_i is the
column number in which the queen covering row i is placed, or, in case the
position of the queen in row i has not been set yet, p_i = *. E.g., the pattern
corresponding to the first chessboard in Figure 6 is p = (0, 4, 1, 5, 2, 6, 3).

Fig. 6. Six out of forty solutions of 7-queens are unique



5.2 Experimental Evaluation 

For our experiments, we used the following standard model for n-queens: 

- Each row i = 0, ..., n - 1 is represented by an integer variable x_i. Assigning
x_i = j corresponds to placing a queen in row i and column j.

- Additional integer variables y_i and w_i, i = 0, ..., n - 1, are used to check the
diagonals of the chessboard. We post the constraints y_i = x_i + i, w_i = x_i - i.

- The domains are x_i ∈ {0, ..., n - 1}, y_i ∈ {0, ..., 2n}, w_i ∈ {-n, ..., n}.

- AllDiff constraints on x, y, and w ensure that no two queens can capture
each other.
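
The model above can be mimicked by a plain depth-first search that keeps the values already taken by x, y and w in three sets. This is a sketch only: the function `count_queens` is a hypothetical name, and the set lookups merely imitate what the AllDiff constraints enforce; no constraint solver or propagation is involved.

```python
def count_queens(n):
    """Count all n-queens placements by depth-first search over the rows."""
    used_x, used_y, used_w = set(), set(), set()

    def place(i):
        if i == n:
            return 1
        total = 0
        for j in range(n):
            y, w = j + i, j - i  # y_i = x_i + i, w_i = x_i - i
            if j in used_x or y in used_y or w in used_w:
                continue  # an AllDiff on x, y or w would be violated
            used_x.add(j); used_y.add(y); used_w.add(w)
            total += place(i + 1)
            used_x.remove(j); used_y.remove(y); used_w.remove(w)
        return total

    return place(0)
```

The counts returned for small n can be compared with the number of solutions reported without symmetry breaking.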

In contrast to the algorithm we developed for the golfer problems, here we
use symmetry also for propagation. A constraint is posted to the model that
keeps track of the current situation in the search. As propagation turned out to
be rather expensive, we limited the number of calls to the propagation routine
to one.

Table 5. Solving n-queens without breaking symmetries (sym), with breaking
symmetries via SBDS, and by avoiding them via SBDD. Computing times are given in
seconds.

                  sym                              SBDS                        SBDD
 n  solutions      fails      time   solutions     fails     time      fails     time
 4          2          4      0.01           1         3     0.00          6     0.00
 5         10          4      0.00           2         4     0.00         13     0.00
 6          4         35      0.01           1        11     0.02         31     0.01
 7         40         69      0.02           6        19     0.01         56     0.02
 8         92        289      0.04          12        63     0.01        130     0.03
 9        352       1111      0.16          46       216     0.04        397     0.08
10        724       5072      0.57          92       851     0.13       1464     0.29
11       2680      22124      2.49         341      3808     0.53       5991     1.26
12      14200     103956     11.88        1787     17673     2.52      27731     6.27
13      73712     531401     61.56        9233     89534    12.55     140348    33.11
14     365596    2932626    337.00       45752    483214    69.62     746530   189.07
15    2279184   16920396   1946.07      285053   2784876   403.16    4391877  1213.36
16   14772512  105445065  12154.60     1846955  17277508  2608.51   27153758  7463.62

We also implemented a version of SBDS and tested it on the model described
above. Both codes were running on the same Sun Enterprise as the program for
the golfer problem in Section 4.

Table 5 compares the number of solutions, the number of fails, and the
computation time for calculating all solutions (sym), calculating only unique
solutions via SBDS, and unique solutions using SBDD, respectively. We omit the
number of solutions for SBDD as it is identical to SBDS. The results given for
SBDS are similar to those given in [4]. Only the number of fails differs slightly,
which we expect to be caused by small variations in the implementation and the
different CP engines used (Solver 4.3 vs. Solver 5.0).

Obviously, SBDD does not perform as well as SBDS on the n-queens problem.
The reason is that the number of symmetries is rather small (compared
to the golfer problem), which makes posting additional symmetry breaking
constraints on backtracking favorable.

6 Conclusion 

We have suggested an approach for breaking symmetries that is based on the
detection of dominance relations between choice points. The method is generally
applicable and works in combination with all exhaustive search strategies, while
it may overrule strategies other than DFS. Moreover, it removes symmetric parts
of the search tree efficiently in combination with any model. Thus, it can also be
used easily by inexperienced users on straightforward models that do not break
symmetries themselves.

The ease of use mainly results from the fact that it is only necessary to
define the pattern structure and a function that checks if one pattern dominates
another. This algorithmic approach allows somewhat more flexibility than a
model that breaks symmetries itself, as has been demonstrated for the golfer
problem when adapting the frequency of certain symmetry considerations.

The method has been shown to be easily applicable, without causing a big
implementation overhead, on three very different applications from combinatorial
optimization and constraint satisfaction. Moreover, it worked efficiently even in
combination with very simple models and also on highly symmetric problems.

As a disadvantage, the use of patterns appears to be less efficient on
transparent and - with respect to symmetry considerations - manageable problems
such as the n-queens problem. There, the dynamic adding of constraints in an
SBDS fashion is clearly favorable.
Abstract. To denote a (3,1,2)-conjugate orthogonal idempotent latin
square of order n, the usual acronym is (3,1,2)-COILS(n). Up to now,
existence of a (3,1,2)-COILS(n) had been proved for every positive
integer n except n = 2, 3, 4, 6, for which the problem was answered in the
negative, and n = 10, for which it remained open. In this paper, we use
a computer program to prove that a (3,1,2)-COILS(10) does not exist.
Following along the lines of recent studies which led to the solution, by
means of computer programs, of many open latin square problems, we
use a constraint satisfaction technique combining an economical
representation of (3,1,2)-COILS with a drastic reduction of the search space.
In this way, resolution time is improved by a ratio of 10^, as compared 
with current computer programs. Thanks to this improvement in perfor- 
mance, we are able to prove the non-existence of a (3,1,2)-COILS(10). 



1 Background and Notations 

Over the last decade, it has become apparent that the field of finite algebra could
largely benefit from computational techniques. Particularly in the area of latin
squares, theorem-proving or constraint-satisfaction techniques have led to the
solution of many open problems [6, 7, 11, 3, 5]. A quite extensive recent survey on
latin squares is provided by F. E. Bennett and L. Zhu [1]. A latin square may be
defined as an n × n grid with each integer 0, 1, ..., n - 1 appearing exactly once in
each row and column. Such a grid can be viewed as defining a binary operation,
say *, on the set S = {0, 1, ..., n - 1}; and the above property means that S with *
is a quasigroup, i.e. equations are uniquely solvable: for a, b ∈ S there is a unique
x ∈ S such that a * x = b, and a unique y ∈ S such that y * a = b. The operation
of a quasigroup is often written multiplicatively and referred to as a product;
note, however, that associativity is not assumed. Henceforth we shall consider
only quasigroups with S as underlying set. The quasigroups Q1 = <S, *> and

* This work was supported by Advanced Micro Devices Inc. 

T. Walsh (Ed.): CP 2001, LNCS 2239, pp. 108- 11^ 2001. 
Q2 = <S, ∘>, with binary operations * and ∘, are isomorphic iff there is a
permutation g of S such that for all a, b ∈ S, g(a * b) = g(a) ∘ g(b). A property
much studied over many decades, especially because of statistical applications,
is the orthogonality of two quasigroups. Q1 = <S, *> and Q2 = <S, ∘> are
said to be orthogonal iff for all a, b, c, d ∈ S, a * b = c * d and a ∘ b = c ∘ d
together imply that a = c and b = d. In other words, when superimposing the
multiplication tables of Q1 = <S, *> and Q2 = <S, ∘>, every ordered pair
of integers occurs exactly once among the pairs (a, b) ∈ S² thus formed.
Noteworthy special orthogonal quasigroup pairs are the so-called conjugate
orthogonal quasigroups. A conjugate Q2 of a quasigroup Q1 is obtained by carrying
out a given permutation on the three elements of the multiplication table: index
of row, index of column, and value of the product. For example, given
Q1 = <S, *>, for any a, b, c ∈ S such that a * b = c, the so-called 312-conjugate
is the quasigroup denoted by Q2 = <S, *312> such that c *312 a = b. Thus
from a quasigroup <Q, *>, six conjugate quasigroups can be obtained,
corresponding to the six permutations of three elements. These conjugate quasigroups
are denoted in a self-explanatory way: <Q, *123>, <Q, *132>, <Q, *213>,
<Q, *231>, <Q, *312>, <Q, *321>. In a conjugate pair (Q1, Q2) with
specified permutation σ, Q2 is redundant, so for brevity we talk of the σ-conjugate
latin square Q1. An additional quasigroup property is often considered, namely
idempotence. A quasigroup is idempotent iff for every a ∈ S, a * a = a. The
conjunction of all quasigroup properties just introduced is indicated by the acronym
COILS, and thus the symbol (i,j,k)-COILS(n) used throughout this paper means
'(i,j,k)-Conjugate Orthogonal Idempotent Latin Square'. The existence of
(i,j,k)-COILSs and of related structures has been intensively studied, and a list of open
problems appears in [1]. Very many open problems have yielded to the use of
powerful model generators. A by now well-established nomenclature, introduced
in Enni, sees the existence or non-existence problems for (2,1,3)-COILS,
(3,2,1)-COILS, and (3,1,2)-COILS as respectively QG0, QG1, and QG2. The notation
QGi(n) with i = 0, 1, 2 is used where the order n of the quasigroup has to be
made explicit. Other quasigroup problems are listed in Table 1. Among the open



Table 1. Constraints of QGi (i ∈ {3, ..., 7})

code name   constraint to be satisfied
QG3         (a * b) * (b * a) = a
QG4         (b * a) * (a * b) = a
QG5         ((b * a) * b) * b = a
QG6         (a * b) * b = a * (a * b)
QG7         (b * a) * b = a * (b * a)



problems solved by means of model generators, let us mention that the first solutions,
for example for QG5(9), were obtained using J. Zhang's FALCON [T^. For
several quasigroups (and several orders), solutions of QG2, QG3, QG4, QG5, QG6,
and QG7 [9] were obtained by the model generators MGTP, DDPP, and FINDER, due
respectively to M. Fujita et al., M. Stickel, and J. Slaney mn]. Similarly, solutions
were found for several quasigroups (and several orders) of QG2, QG5, and QG6
[14] with H. Zhang's SATO [1^. Finally, let us also mention that the model
generator SEM by J. Zhang and H. Zhang m has solved new open problems of
types QG5, QG7, QG8, QG9.

In this paper we are concerned with quasigroups of type QG2. In 1992 the 
situation was as follows. The existence of quasigroups QG2(n) had been proved 
for any integer n except n = 2,3,4, 6 for which it had been proved that no 
quasigroup QG2 of such orders existed, and n = 10, 12, 14, 15 for which the 
answer was unknown. In 1995 M. Stickel gave a solution for the QG2(12) using 
DDPP, then in 1996 H. Zhang et al. gave a (non-idempotent) solution of a 
QG2(14) and a QG2(15) [14]. In order to completely solve the problem QG2, 
there remained to find an answer for the QG2(10). Solving the QG2(10) had been 
noted as particularly difficult in WM- We developed a specific model generator, 
qgs, which we present in this paper, to solve the QG2(10). qgs allows us to 
conclude that there exists no quasigroup of type QG2 and order 10. qgs is a 
model generator along the lines of existing ones. The essential innovation in qgs 
is to provide an effective resolution strategy for exploring a huge search space 
when there is no solution. The efficiency of qgs resides essentially in two things: 

- a representation of the constraints inherent in the quasigroup and in the
orthogonality property, which offers an economical treatment of the elementary
operations concerned with search space exploration;

- a drastic reduction of the search space by eliminating redundant isomorphic
subspaces.

These two points are explained in detail in Sections 2 and 3 below. We then give
the ensuing conclusion as to non-existence. For information, the last section gives
performance comparisons of qgs vs. FINDER, SATO, and SEM on type-QG1
and QG2 quasigroup resolution.



2 Encoding the Constraints Inherent in the Quasigroup
of Type QG2

Using the notations of the foregoing section, <Q, *> denotes a quasigroup
on the set S = {0, 1, ..., n - 1}, n being the order of the quasigroup, and * the
associated binary operation.

There are essentially two types of constraints that a quasigroup of type QG2 
and order n has to satisfy: 

- the constraints related to the existence of the quasigroup itself, demanding
that any integer i ∈ S appear just once in every row and column of the
multiplication table of the quasigroup;

- the constraints related to the orthogonality property, demanding that for any
i, j, k, l ∈ S with (i, j) ≠ (k, l), the two pairs (i * j, i *312 j) and
(k * l, k *312 l) be distinct.
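
Both types of constraints, together with idempotence, can be checked directly on a complete multiplication table. The sketch below is our own illustration: the function names `conjugate_312`, `is_latin` and `is_qg2` are hypothetical, and the table is assumed to be a list of rows with X[a][b] = a * b.

```python
def conjugate_312(X):
    """Return Y with Y[c][a] = b whenever X[a][b] = c (X must be a latin square)."""
    n = len(X)
    Y = [[None] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            Y[X[a][b]][a] = b
    return Y

def is_latin(X):
    """Every value 0..n-1 appears exactly once in each row and each column."""
    n = len(X)
    full = set(range(n))
    return (all(set(row) == full for row in X)
            and all({X[i][j] for i in range(n)} == full for j in range(n)))

def is_qg2(X):
    """True iff X is an idempotent latin square orthogonal to its 312-conjugate."""
    n = len(X)
    if not is_latin(X) or any(X[i][i] != i for i in range(n)):
        return False
    Y = conjugate_312(X)
    pairs = {(X[i][j], Y[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n  # all n^2 superimposed pairs are distinct
```

As a usage example, the table X[a][b] = (2a + 4b) mod 5 passes the check, whereas X[a][b] = (3a + 5b) mod 7 is an idempotent latin square that coincides with its own 312-conjugate and therefore fails the orthogonality test.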
Two options are available for the treatment of these constraints. They may
be expressed as propositional clauses. In this case the treatment will consist
in investigating the existence of a truth value assignment to the propositional 
variables so as to satisfy all clauses. Or, the constraints may be expressed as 
first order predicates. In this case, the treatment will consist in searching for 
an assignment of values to the variables within their respective domains, so as 
to satisfy the predicates. The main existing model generators, such as DDPP, 
FINDER, and SATO, treat the constraints as propositional clauses. This choice 
made by the authors of these model generators, seems to stem from the intention 
to build general-purpose model generators, i.e. ones able to handle all problems 
pertaining to quasigroups and related structures, and beyond that, other finite 
algebra problems if necessary. Indeed, when the constraints of the problem under 
consideration are expressed as propositional clauses, the resolution strategy is 
independent of the original problem. The specificity of the latter is taken account 
of at the level of the translation of the constraints into propositional clauses. If, on 
the other hand, the choice is made to represent the constraints as predicates, then 
efficiency requires that the treatment within the model generator be specifically 
designed for these particular constraints. Every new type of problem then calls 
for a specific model generator. These two ways of handling the constraints entail 
different resolution strategies. For example, regarding quasigroups, propositional 
constraints allow a propositional variable to be associated with the value of a cell 
in the multiplication table. A heuristic will therefore be able to make decisions 
at the level of the value of a cell. By contrast, constraints expressed as predicates 
only allow a variable to be associated with a cell in the multiplication table, not 
with its values. For the reasons just mentioned, namely generality of purpose 
of the model generator’s treatment, and subtlety of the resolution strategy, the 
choice of propositional clauses to model the constraints is justified. However, we 
show in the sequel that the cost of treatment of the propositional clauses for the 
quasigroups of type QG2 is exorbitant. Besides, this problem was clearly raised 
in m in connection with SATO. 



We developed a model generator qgs specifically for the resolution of
quasigroup problems of type QG2. In qgs the constraints are treated in the classical
form of CSPs. Thus, to each cell of the multiplication table to be constructed
for a quasigroup, i.e. to every i, j ∈ S = {0, 1, ..., n - 1}, we associate a variable
X(i, j) with domain D_X(i, j) = S. Similarly, to each cell of the multiplication
table of the (3,1,2)-conjugate, a variable Y(i, j) is associated with domain
D_Y(i, j) = S. Finally, to each pair (u, v) ∈ S², we associate a variable Z(u, v)
whose domain is {0, 1}. All variables Z(u, v) have the value 0 initially, and Z(u, v)
takes the value 1 as soon as a pair has appeared in the quasigroup and its
conjugate such that X(i, j) = u and Y(i, j) = v. In qgs, resolution treatment consists
in developing a search tree by assigning the variables X(i, j) admissible values
within their domains. A value is admissible for a variable X(i, j) if it complies
with the quasigroup constraint, i.e. if no variable X(i, k) or X(k, j) (for some
k ∈ S) has already been assigned the same value; and with the orthogonality
constraint, i.e., pairing it with the value in the conjugate's homologous cell does
not produce a pair already giving the associated variable Z the value 1.
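
The following is a minimal sketch of a search organised around the variables X, Y and Z as just described. It is not the qgs implementation: the name `solve_qg2`, the fixed row-by-row variable order, the absence of domain bookkeeping and the use of a Python set for Z are all assumptions made to keep the example short.

```python
def solve_qg2(n):
    """Return an idempotent latin square orthogonal to its 312-conjugate, or None."""
    X = [[None] * n for _ in range(n)]  # multiplication table
    Y = [[None] * n for _ in range(n)]  # conjugate: Y[c][a] = b iff X[a][b] = c
    Z = set()                           # pairs (X[i][j], Y[i][j]) already seen

    def assign(i, j, v):
        """Try X[i][j] = v; return the newly formed pairs, or None if inadmissible."""
        if any(X[i][k] == v or X[k][j] == v for k in range(n)):
            return None  # quasigroup constraint violated
        X[i][j], Y[v][i] = v, j
        cells = {(a, b) for (a, b) in ((i, j), (v, i))
                 if X[a][b] is not None and Y[a][b] is not None}
        pairs = [(X[a][b], Y[a][b]) for (a, b) in cells]
        if len(set(pairs)) < len(pairs) or any(p in Z for p in pairs):
            X[i][j], Y[v][i] = None, None  # orthogonality constraint violated
            return None
        Z.update(pairs)
        return pairs

    def undo(i, j, v, pairs):
        """Backtrack: restore the state that held before assign(i, j, v)."""
        Z.difference_update(pairs)
        X[i][j], Y[v][i] = None, None

    free = [(i, j) for i in range(n) for j in range(n) if i != j]

    def search(k):
        if k == len(free):
            return True
        i, j = free[k]
        for v in range(n):
            pairs = assign(i, j, v)
            if pairs is not None:
                if search(k + 1):
                    return True
                undo(i, j, v, pairs)
        return False

    for i in range(n):  # idempotence fixes the diagonal
        if assign(i, i, i) is None:
            return None
    return [row[:] for row in X] if search(0) else None
```

Each assignment touches only the row and column of one cell and at most two entries of Z, which is the kind of cheap elementary operation the comparison with propositional clauses is about.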

qgs and existing model generators dealing with propositional constraints ex- 
plore the search space by means of similar elementary operations. Particularly 
two are: 

- assigning a variable a value, with the ensuing propagations; 

- the backtracking process, which consists in reconstructing a previous state of 
the data structures. 

We did a performance comparison of these two types of operations in qgs as
against SATO, which happened to be the most convenient for such a study.
Tables 2 and 3 give mean treatment times for each operation, performed 500,000
times by both SATO and qgs in the course of processing problems from QG2(8)
to QG2(15) (without, of course, the aim to solve them).



Table 2. Mean time (on first 500,000 calls to the branch function) to assign a value
to a cell with qgs and SATO

Quasigroup    SATO             qgs
Problem       (in μseconds)    (in μseconds)
QG2(8)            1239.5            10.2
QG2(9)            2458.4            11.1
QG2(10)           4863.9            13.5
QG2(11)           8578.9            15.6
QG2(12)          12996.1            17.9
QG2(13)          21447.8            19.4
QG2(14)          34666.9            22.1
QG2(15)          57362.1            24.2


Table 3. Mean time (on first 500,000 calls to the backtrack function) to rebuild a cell
with qgs and SATO

Quasigroup    SATO             qgs
Problem       (in μseconds)    (in μseconds)
QG2(8)             139.7             0.8
QG2(9)             319.5             1.0
QG2(10)            685.9             1.1
QG2(11)           1276.9             1.1
QG2(12)           2096.3             1.1
QG2(13)           3244.6             1.2
QG2(14)           4634.6             1.2
QG2(15)           6119.1             1.3
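
The speed ratios implied by the two tables of mean times can be recomputed from the tabulated values. The snippet below is a small sketch that only restates those measurements; the variable and function names are ours.

```python
ORDERS = list(range(8, 16))  # QG2(8) .. QG2(15)

# Mean times in microseconds, copied from the two tables above.
SATO_ASSIGN = [1239.5, 2458.4, 4863.9, 8578.9, 12996.1, 21447.8, 34666.9, 57362.1]
QGS_ASSIGN = [10.2, 11.1, 13.5, 15.6, 17.9, 19.4, 22.1, 24.2]
SATO_BACKTRACK = [139.7, 319.5, 685.9, 1276.9, 2096.3, 3244.6, 4634.6, 6119.1]
QGS_BACKTRACK = [0.8, 1.0, 1.1, 1.1, 1.1, 1.2, 1.2, 1.3]

def ratios(slow, fast):
    """Element-wise ratio slow / fast for two equally long lists of times."""
    return [s / f for s, f in zip(slow, fast)]

if __name__ == "__main__":
    rows = zip(ORDERS,
               ratios(SATO_ASSIGN, QGS_ASSIGN),
               ratios(SATO_BACKTRACK, QGS_BACKTRACK))
    for order, assign, backtrack in rows:
        print(f"QG2({order}): assign x{assign:.0f}, backtrack x{backtrack:.0f}")
```

Both ratios grow with the order of the quasigroup, since the SATO times increase much faster than the qgs times.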
Tables 2 and 3 show that mean processing times for SATO are 100 to 1000
times higher than those of qgs. Figure 1 allows the comparative evolution of these
times, as a function of quasigroup order, to be visualized for both generators.

Fig. 1. Mean time (on first 500,000 calls to the functions) to compute the next node
and rebuild it with SATO and qgs

The curves linking the points corresponding to the values in Tables 2 and 3 only
serve for visual appreciation; they obviously have no reality since the order of
a quasigroup is an integer. Such substantial differences in processing time for
similar operations in SATO and qgs are explained by the size of the structures
handled by SATO. For an order n QG2 quasigroup, SATO is known to generate
clauses containing exactly n³ variables. The clauses generated to express the
orthogonality constraint have 4 literals, and their number is in 0{nP), lending 
itself to reduction to 0{n‘^) [13]. The clauses generated to express the quasigroup 
constraints have 2 literals, and their number is in O(n^4). Table 4 gives, for



Table 4. Number of variables, clauses of size 4 and 2 in the CNF formulae generated by
SATO

Quasigroup    number of    number of    number of propositional
Problem       4-clauses    2-clauses    variables
QG2(6)             1550          970          216
QG2(7)             5781         2002          343
QG2(8)            17775         3696          512
QG2(9)            46096         6288          729
QG2(10)          105187        10050         1000
QG2(11)          217561        15290         1331
QG2(12)          416366        22352         1728


problems QG2(6) to QG2(12), the exact numbers of generated clauses for both
the orthogonality and the quasigroup constraints. As for qgs, the number of
variables is O(n²), the orthogonality constraint translates into a test of the form
Z(u, v) < 1 for every u, v ∈ S, and the quasigroup constraints translate into a
domain update for the variables X(i, j) and Y(i, j), together with a test on domain
size for X(i, j).

The question may be raised as to whether the computation time ratios
between SATO and qgs, as shown in Tables 2 and 3, carry over to their
respective global resolution times when effectively solving QG2 quasigroups. Table 5
gives these resolution times, which do not provide a definite answer. Apart from


Table 5. Run time of qgs and SATO on some orders of QG2 problems

Quasigroup    SATO            qgs
Problem       (in seconds)    (in seconds)
QG2(6)             0.00            0.00
QG2(7)             0.02            0.00
QG2(8)             8.72           17.15
QG2(9)           319.09            0.25



QG2(6), which has no solution but for which the processing time is too short, for
the other QG2s, which have a solution, the time depends essentially not on the
time taken to explore the search space, but on the speed at which the solution is
found. In the next section, search space exploration times will be compared on
subproblems of QG2(10), which has no solution.

To conclude, for solving the problem QG2(10), and thus possibly exploring
the search space for this quasigroup in its entirety, the computation time ratios
in the above test make it reasonable to assume that it is more efficient to treat
QG2 quasigroup constraints in the CSP form rather than in the propositional
clauses form.



3 Reducing the Search Space by Eliminating Redundant 
Isomorphic Quasigroups 

To make the resolution of QG2(10) possible, search space reduction is essential.
In jS], M. Fujita et al. have defined a rule, called Least Number Heuristic (LNH)
by J. Zhang in m, aimed at eliminating some subspaces isomorphic to ones
already searched. Recall that two quasigroups Q1 = <S, *> and Q2 = <S, ∘> are
isomorphic if there exists a permutation g on the n integers of S, such that for all
i, j ∈ S, g(i * j) = g(i) ∘ g(j). This property of isomorphism between quasigroups
is crucial for search space reduction. The rule defined by M. Fujita et al. is used
in the model generators FALCON, MGTP, DDPP, FINDER, SATO, and SEM
for solving quasigroups. Applied to the first row of a quasigroup multiplication
table, this rule permits the search space of idempotent QG2(10) quasigroups to
be reduced to quasigroups whose first row is one of the 21 listed in the first
column of Table 6, labelled L1 to L21. However, the search space reduction thus
attained is still quite insufficient for solving QG2(10). A first remark makes it
Table 6. The 21 first rows produced by LNH, reduced to 8 rows by isomorphism

Configurations of first rows in QG2(10)    Permutation            Configuration obtained
value of cells (0,0), (0,1), ..., (0,9)    to apply               after permutation
L1   0,2,3,1,5,6,4,8,9,7                   none                   -
L2   0,2,1,4,3,6,5,8,9,7                   none                   -
L3   0,2,1,4,3,6,7,8,9,5                   none                   -
L4   0,2,1,4,5,3,7,8,9,6                   none                   -
L5   0,2,3,4,1,6,7,8,9,5                   none                   -
L6   0,2,3,1,5,6,7,8,9,4                   none                   -
L7   0,2,1,4,5,6,7,8,9,3                   none                   -
L8   0,2,3,4,5,6,7,8,9,1                   none                   -
L9   0,2,1,4,3,6,7,5,9,8                   0,1,2,3,4,7,8,9,5,6    L2
L10  0,2,1,4,5,3,7,6,9,8                   0,1,2,7,8,9,3,4,5,6    L2
L11  0,2,3,1,5,4,7,6,9,8                   0,7,8,9,1,2,3,4,5,6    L2
L12  0,2,1,4,5,6,7,3,9,8                   0,1,2,5,6,7,8,9,3,4    L3
L13  0,2,3,1,5,4,7,6,9,8                   0,5,6,7,8,9,1,2,3,4    L3
L14  0,2,1,4,5,6,3,8,9,7                   0,1,2,6,7,8,9,3,4,5    L4
L15  0,2,3,1,5,4,7,8,9,6                   0,3,4,5,1,2,6,7,8,9    L4
L16  0,2,3,1,5,6,7,4,9,8                   0,3,4,5,6,7,8,9,1,2    L4
L17  0,2,3,4,1,6,5,8,9,7                   0,6,7,8,9,1,2,3,4,5    L4
L18  0,2,3,4,1,6,7,5,9,8                   0,6,7,8,9,3,4,5,1,2    L4
L19  0,2,3,4,5,1,7,8,9,6                   0,5,6,7,8,9,1,2,3,4    L5
L20  0,2,3,4,5,6,1,8,9,7                   0,4,5,6,7,8,9,1,2,3    L6
L21  0,2,3,4,5,6,7,1,9,8                   0,3,4,5,6,7,8,9,1,2    L7


possible to further enhance the search space reduction resulting from the rule of
M. Fujita et al. For each of rows L9 to L21, there exists a permutation g of the
(ordered) integer sequence 0, 1, 2, ..., 9 into an integer sequence appearing in
the 2nd column of Table 6 such that, applying it to 0 and to the integers j, k ∈ S
in the binary operation 0 * j = k associated with each of rows L9 to L21, a binary
operation g(0) * g(j) = 0 * g(j) = g(k) is obtained which actually corresponds to
one of those associated with the first 8 rows, L1 to L8. The row labels associated
by permutation to rows L9 to L21 are listed in the 3rd column. As a result of
this observation, the search for QG2(10) quasigroups may be restricted to those
whose first row is among those labelled L1 to L8. This shrinks the search space
by a ratio of nearly 3. In spite of this additional search space reduction, we were
not able using qgs to complete the processing of QG2(10) for one of the rows
L1 to L8. We therefore sought to enlarge the configuration of a row for which
isomorphisms between quasigroups could be checked on a computer. Two
configurations were studied: (a) first row and second row, denoted RR, (b) first row
and first column, denoted RC.
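
The mapping of a first row under a permutation g can be written in a few lines. The sketch below is illustrative only; `conjugate_row` and `cycle_type` are hypothetical names, rows and permutations are plain Python lists, and the two examples use rows L9, L21, L2 and L7 with the permutations listed for L9 and L21.

```python
def conjugate_row(row, g):
    """Image of a first row under g: the entry 0 * j = k becomes 0 * g(j) = g(k)."""
    image = [None] * len(row)
    for j, k in enumerate(row):
        image[g[j]] = g[k]
    return image

def cycle_type(row):
    """Sorted cycle lengths of the row read as the permutation j -> 0 * j."""
    seen, lengths = set(), []
    for start in range(len(row)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = row[j]
            length += 1
        lengths.append(length)
    return sorted(lengths)

L2 = [0, 2, 1, 4, 3, 6, 5, 8, 9, 7]
L7 = [0, 2, 1, 4, 5, 6, 7, 8, 9, 3]
L9 = [0, 2, 1, 4, 3, 6, 7, 5, 9, 8]
L21 = [0, 2, 3, 4, 5, 6, 7, 1, 9, 8]
G9 = [0, 1, 2, 3, 4, 7, 8, 9, 5, 6]
G21 = [0, 3, 4, 5, 6, 7, 8, 9, 1, 2]
```

Since applying g amounts to conjugating the row permutation by g, two first rows can be mapped onto each other in this way exactly when they have the same cycle type, which offers one way to read why the 21 rows collapse to 8.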

(a) Configuration of type RR.

A computer program was used to enumerate a set of 69,411 configurations of type
RR, denoted c_i, with i = 1, 2, ..., 69411, for an idempotent QG2(10) quasigroup,
such that no permutation g exists which, when applied to the binary operation
associated with an arbitrary RR configuration of an idempotent QG2(10),
returns a binary operation associated with one of the configurations c_i. In other
words, the configurations not belonging to the c_i's are such that there exists a
permutation g of the integers of S with g(0) = 0 and g(1) = 1 which, when
applied to the integers j, k, m, p ∈ S of the binary operation defined by 0 * j = k
and 1 * m = p associated with each of these configurations, produces a binary
operation 0 * g(j) = g(k) and 1 * g(m) = g(p) associated with one of the
configurations c_i. A quasigroup with such a configuration, corresponding by a permutation
to one of the configurations c_i, is isomorphic to one of the quasigroups having
the latter configuration. It may therefore be rejected. To enumerate the
configurations c_i, we first selected the first row configurations liable to belong to
the configurations c_i, in order to limit the combinatorial enumeration. We found
17 such first rows. We then enumerated all possible second row configurations
compatible with one of these 17 first rows. All in all, 69,411 configurations of
type RR were found to conform to the above conditions. The time required to
carry out these enumerations was 1 h 53 min.

(b) Configuration of type RC.

A similar enumeration to the above was accomplished, substituting the first
column for the second row. In this case, the permutations g considered only
satisfy the relation g(0) = 0, and not g(1) = 1 as previously. It was possible in
this case to carry out the enumeration directly from the first 8 rows L1, ..., L8.
Moreover, the admissible permutations are those which, when applied to Li for
i = 1, ..., 8, return Li identically. We call them identity permutations. Table 7
gives the number of these identity permutations for each row Li. Then,
enumerating all possible column configurations compatible with each of the rows Li,
we obtained a total of 16,085 configurations of type RC. The time required to
carry out these enumerations was 21 s. Table 7 gives the detail of configuration
numbers for the first 8 rows. It may be observed that, as expected, the number of
retained configurations increases as the number of identity permutations decreases.
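
The number of permutations that leave a given first row unchanged can be obtained from its cycle structure. The sketch below relies on the standard formula for the order of the centralizer of a permutation (the product of m^c · c! over cycle lengths m occurring c times); the function names are ours, and reading the reported counts as this order minus one (i.e. excluding the trivial permutation) is our assumption.

```python
from collections import Counter
from math import factorial

FIRST_ROWS = {
    "L1": [0, 2, 3, 1, 5, 6, 4, 8, 9, 7],
    "L2": [0, 2, 1, 4, 3, 6, 5, 8, 9, 7],
    "L3": [0, 2, 1, 4, 3, 6, 7, 8, 9, 5],
    "L4": [0, 2, 1, 4, 5, 3, 7, 8, 9, 6],
    "L5": [0, 2, 3, 4, 1, 6, 7, 8, 9, 5],
    "L6": [0, 2, 3, 1, 5, 6, 7, 8, 9, 4],
    "L7": [0, 2, 1, 4, 5, 6, 7, 8, 9, 3],
    "L8": [0, 2, 3, 4, 5, 6, 7, 8, 9, 1],
}

def cycle_lengths(row):
    """Cycle lengths of the row read as the permutation j -> 0 * j."""
    seen, lengths = set(), []
    for start in range(len(row)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = row[j]
            length += 1
        lengths.append(length)
    return lengths

def stabilizer_order(row):
    """Number of permutations g (identity included) mapping the row onto itself."""
    order = 1
    for m, c in Counter(cycle_lengths(row)).items():
        order *= m ** c * factorial(c)
    return order

def nontrivial_stabilizers(row):
    """Stabilizing permutations other than the identity."""
    return stabilizer_order(row) - 1
```

Because 0 is a fixed point of every first row, each such g automatically satisfies g(0) = 0, so no extra filtering is needed in this sketch.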



Table 7. Non-isomorphic RC configurations produced

Configurations of        Number of identity    Number of non-isomorphic
first rows in QG2(10)    permutations          RC configurations
L1                            161                    290
L2                            143                    318
L3                             39                   1094
L4                             23                   1820
L5                             19                   2178
L6                             17                   2442
L7                             13                   3100
L8                              8                   4843

The number of RC configurations is seen to be smaller than that of RR
configurations by a factor of more than 4. This result was predictable, since
globally a maximum of 8!2! permutations can be applied to the set of RR
configurations, as against a higher maximum of 9! for the set of RC configurations.
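
The counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below (Python; the variable names are ours and purely illustrative) sums the per-row figures of Table 7, compares the total with the RR count, and evaluates the two permutation bounds 8!2! and 9!.

```python
from math import factorial

# Non-isomorphic RC configurations per first row L1..L8 (Table 7).
rc_per_row = [290, 318, 1094, 1820, 2178, 2442, 3100, 4843]
rc_total = sum(rc_per_row)               # total number of RC configurations
rr_total = 69_411                        # RR configurations reported in the text

ratio = rr_total / rc_total              # how many times more RR than RC
rr_bound = factorial(8) * factorial(2)   # 8!2! permutations for RR
rc_bound = factorial(9)                  # 9! permutations for RC

print(rc_total, round(ratio, 2), rr_bound, rc_bound)
```

The total recovers the 16,085 RC configurations, the ratio is indeed larger than 4, and the RC bound (9!) exceeds the RR bound (8!2!).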

The choice between the two types of configurations, RR and RC, for solving
QG2(10) was determined by the shortest processing time that could be expected
from either type. Table 8 gives the mean resolution time with qgs on samples
of 100 randomly drawn configurations. Configurations of type RC are also seen
to have a mean resolution time smaller than that of type RR by a factor of
5. This result, hard to predict a priori, may be understood by examining Figure
2. For RC configurations, the number of pairs formed by combining values in
the multiplication table of the quasigroup and that of its conjugate is 10, as
against 4 for RR configurations. If we assume the orthogonality constraint to be
paramount in determining resolution time, it is normal that this time should be
smaller for RC than for RR configurations.

Table 8. Mean nodes and mean time to solve 100 randomly drawn configurations
based on 8 first rows of QG2(10)

Algorithm   RC                               RR
            mean #nodes     mean time        mean #nodes      mean time
            (std dev.)      (std dev.)       (std dev.)       (std dev.)
SATO        12.6 × 10^?     3 days, 19h      -                > 7 days
            (3.2 × 10^?)    (1 day, 16h)
qgs         22.1 × 10^6     11m 38s          115.4 × 10^6     59m 31s
            (7.4 × 10^6)    (3m 55s)         (43.5 × 10^6)    (22m 21s)

Finally, the 16,085 RC configurations were solved with qgs on PCs equipped
with AMD Athlons running at 1 GHz under a Linux operating system. No
QG2(10) was found, enabling us to conclude that no QG2(10) exists.
The total cumulative processing time was 137 days, 4 hours, and 20 minutes. The 
total number of branches of the trees developed by qgs was 387,732,916,219. The 
mean processing time of an RC configuration was 736 seconds, with a standard 
deviation of 256 seconds. The mean number of branches of the search tree for 
an RC configuration was 24,105,000, with a standard deviation of 8,115,000. We 
have attempted, on the other hand, to provide an estimate of SATO’s processing 
time on this problem, to be compared with the estimate of 30,643 years
given by its author. We ran SATO on a sample of 100 RC configurations. The mean
time was 3 days 19 hours, with a standard deviation of 1 day 16 hours. From this 
mean time, SATO’s total processing time to solve the 16,085 RC configurations 
can be estimated as about 167 years. The time ratio with respect to qgs is seen
to be about 440, fully consistent with the ratios indicated for elementary
operations (for QG2(10)) in Section 2, Table 5 and Table 6. Turning now to SATO’s
habitual working conditions, i.e. without using RC configurations to reduce the
search space, an estimate of total resolution time for QG2(10) is 9240 years,
which may be viewed as being of a similar order of magnitude to the 30,643 years
given by its author.

Fig. 2. Orthogonality on the configurations “RR” and “RC”. [Figure: for each of the two configuration types, the Latin square and its (3,1,2)-conjugate, with columns 0 to 9. Legend: cell assigned a value together with its corresponding conjugate cell; cell assigned a value alone; cell not assigned a value; cell assigned "0".]
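
The totals and the qgs/SATO ratio discussed above follow from the per-configuration means by simple arithmetic. A minimal sketch, assuming a 365.25-day year for the conversion (this convention and the variable names are ours, not the paper's):

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25                 # assumption: Julian year

n_configs = 16_085                     # RC configurations to solve
qgs_mean_s = 736                       # mean qgs time per RC configuration
sato_mean_s = (3 * 24 + 19) * 3600     # SATO sample mean: 3 days 19 hours

qgs_total_days = n_configs * qgs_mean_s / SECONDS_PER_DAY
sato_total_years = n_configs * sato_mean_s / SECONDS_PER_DAY / DAYS_PER_YEAR
ratio = sato_mean_s / qgs_mean_s       # per-configuration slowdown of SATO

print(f"qgs  total: {qgs_total_days:.1f} days")
print(f"SATO total: {sato_total_years:.1f} years")
print(f"ratio     : {ratio:.0f}")
```

The results agree with the reported figures of about 137 days for qgs, about 167 years for SATO, and a ratio in the neighbourhood of 440.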

4 Further Results 

For the reader’s information, qgs’s performance on quasigroups of type QG1 and
QG2 is compared to that of the main model generators, namely SATO, FINDER,
SEM, and, as a reference point, the ‘general’ SAT solver posit [2]. As regards qgs,
we applied the same strategy of elimination of isomorphic subspaces as in the
resolution of QG2(10). Table 9 lists resolution tree sizes for posit, SATO, and
qgs in terms of nodes developed. FINDER and SEM do not appear in Table 9
since they do not report explicit branch numbers for their runs.

It should be noted that qgs develops trees of significant size, much larger for
instance than those developed by SATO (by a factor of up to 50). These differences
have an explanation. Almost all quasigroups in this test have a solution, and
qgs, contrary to SATO, specializes in search space exploration and not in solution
finding. In terms of computing time, qgs is seen in Table 10 to be the fastest (beyond
order 7 we were unable to give computation times for FINDER and SEM, which
are tricky to use, especially in the phase of problem description using first-order
logic).

Table 9. Number of nodes developed in the search tree of posit, SATO and qgs for
QG1 and QG2 of order from 6 to 9

Quasigroup   Model      posit         SATO          qgs
problem      exists?    (#nodes)      (#nodes)      (#nodes)
QG1(6)       no         1             7             63
QG1(7)       yes        4             15            106
QG1(8)       yes        198           84313         18091
QG1(9)       yes        -             4.51 × 10^?   229 × 10^?
QG2(6)       no         0             6             23
QG2(7)       yes        16            8             6
QG2(8)       yes        21            11173         15453
QG2(9)       yes        240 × 10^?    1.83 × 10^?   110 × 10^?


Table 10. Computation time of posit, SEM, FINDER, SATO and qgs to prove existence
or non-existence of QG1 and QG2 of order from 6 to 9 (run times in seconds)

Quasigroup   Model      posit       SEM      FINDER   SATO       qgs
problem      exists?
QG1(6)       no         0.01s       0.20s    0.03s    0.01s      0.00s
QG1(7)       yes        0.00s       4.57s    0.27s    0.03s      0.01s
QG1(8)       yes        0.20s       -        -        107.02s    0.70s
QG1(9)       yes        > 7 days    -        -        1170.59s   632.98s
QG2(6)       no         0.00s       0.63s    0.03s    0.00s      0.00s
QG2(7)       yes        0.00s       0.72s    0.23s    0.02s      0.01s
QG2(8)       yes        0.04s       -        -        8.72s      1.75s
QG2(9)       yes        59451.00s   -        -        319.09s    227.79s

5 Conclusion 

The last few years have been very rich with respect to quasigroup resolution.
Computer programs developed to this effect have made possible the resolution of
many open problems about quasigroup existence or non-existence. For QG2,
only order 10 remained open and out of reach of current computer programs,
irrespective of whether the approach was sequential or parallel. With the
aim of solving QG2(10), we have presented a new solver, qgs, specialized in the
resolution of quasigroups of type QG2. qgs achieves a very significant reduction
of the search space, allied with economical processing. We were thus able to
disprove the existence of QG2(10) in 140 days of sequential computation. This
leads to the conclusion that designing specialized model generators can increase
the likelihood of solving further open problems.

Abstract. This paper contains an experimental study of the impact of the construction
strategy of reduced, ordered binary decision diagrams (ROBDDs) on
the average-case computational complexity of random 3-SAT, using the CUDD 
package. We study the variation of median running times for a large collection of 
random 3-SAT problems as a function of the density as well as the order (number of 
variables) of the instances. We used ROBDD-based pure SAT-solving algorithms, 
which we obtained by an aggressive application of existential quantification, aug- 
mented by several heuristic optimizations. Our main finding is that our algorithms 
display an “easy-hard-less-hard” pattern that is quite similar to that observed ear- 
lier for search-based solvers. When we start with low-density instances and then 
increase the density, we go from a region of polynomial running time, to a region 
of exponential running time, where the exponent first increases and then decreases 
as a function of the density. The locations of both transitions, from polynomial to 
exponential and from increasing to decreasing exponent, are algorithm dependent. 
In particular, the running time peak is quite independent from the crossover density 
of 4.26 (where the probability of satisfiability declines precipitously); it occurs 
at density 3.8 for one algorithm and at density 2.3 for another, demonstrating
that the correlation between the crossover density and computational hardness is 
algorithm dependent. 



1 Introduction 

The last decade has seen an intense focus on the complexity of randomly generated 
combinatorial problems. This interest was stimulated by the discovery of a fascinat- 
ing connection between the density of combinatorial problems and their computational 
complexity, see rnrnn . A problem that has received a lot of attention in this area is 
the 3-satisfiability problem (3-SAT), which is a paradigmatic combinatorial problem,

* Part of this work was done while this author was on sabbatical at Rice University, funded in 
part by CONACyT grant 145502. 

** Work partially supported by NSF grants IIS-9908435, IIS-9978135, CCR-9988322, and EIA- 
0086264, and by a grant from the Intel Corporation. 
and also important for its own sake. An instance of 3-SAT consists of a conjunction of
clauses, each one a disjunction of three literals. The goal is to find a truth assignment 
that satisfies all clauses. The density of a 3-SAT instance is the ratio of the number of 
clauses to the number of Boolean variables (we refer to the latter number as the order 
of the instance). Clearly, a low density suggests that the instance is under-constrained, 
and therefore is likely to be satisfiable, while a high density suggests that the instance 
is over-constrained and is unlikely to be satisfiable. Experimental research [15,31] has
shown that for ratios below (roughly) 4.26, the probability of satisfiability goes to 1 as
the order increases, while for ratios above 4.26 the probability goes to 0. At 4.26, the
probability of satisfiability is 0.5. We call this density the crossover density. Formally 
establishing the crossover density is known to be quite difficult, and is the subject of 
continuing research, cf. [18,17,1].

The experiments in [15,31], which applied algorithms based on the so-called
Davis-Logemann-Loveland method (abbr., DLL method), a depth-first search with unit
propagation [16], also show that the density of a 3-SAT instance is intimately related to its
computational complexity. Intuitively, it seems that under-constrained instances are easy 
to solve, as a satisfying assignment can be found fast, and over-constrained instances 
are also easy to solve, as all branches of the search terminate quickly. Indeed, the data 
displayed in [15,31] show how the running time increases with increasing density until
the crossover density and then declines with increasing density, with a marked running- 
time peak essentially at the crossover density. What we see at the crossover density is 
in essence a phase transition, viz., a marked qualitative change in the structural proper- 
ties of the problem. This pattern of behavior with a running-time peak at the crossover 
density is called the easy-hard-easy pattern and is the subject of extensive research, cf. 

m- 

In [13] it was pointed out that this picture is quite simplistic for various reasons.
First, it is not clear where the boundaries between the “easy”, “hard”, and “easy” re- 
gions are. Second, the terms “easy” and “hard” do not carry any rigorous meaning. The 
computational complexity of a problem is typically studied on an infinite collection of 
instances, and is specified as a function of problem size or order. The easy-hard-easy 
pattern, however, is observed when the order is fixed while the density varies, but once 
the order is fixed, there are only finitely many possible instances. For that reason, the- 
oretical analyses of the random 3-SAT problem focus on collections of fixed-density 
instances, rather than on collections of fixed-order instances.¹ Third, in the context of a
concrete application, e.g., bounded model checking [4], it is typically the order that tends
to grow while the density stays fixed, for example, as we search for longer and longer 
counterexamples in bounded model checking. Thus, the easy-hard-easy pattern tells us 
little about the complexity of 3-SAT in such settings. Until recently, however, there was 
little experimental work that studies how the running time of a SAT solver varies as a 
function of the order for fixed-density instances. Finally, the experiments reported in 
[31,15] are focused solely on DLL-based algorithms. While these are indeed the most
popular algorithms for the satisfiability problem, one cannot jump to conclusions about 



¹ For example, it is known that in the high-density region, above density 5.2, the DLL method is
provably exponential [11]; see also [3].
the inherent and practical complexity of random 3-SAT based solely on experiments
using these algorithms. 

The goal of the research reported in [13] was to determine how the average-case
complexity of random 3-SAT, understood as a function of the order for fixed density 
instances, depends on the density for a variety of SAT solvers. Is there a phase transition 
in which the complexity shifts from polynomial to exponential? Is such a transition 
dependent or independent of the solver? To explore these questions, Coarfa et al. [13]
set out to obtain a good coverage of an initial quadrangle of the two-dimensional d x n 
quadrant, where d is the density and n is the order, exploring the range 0 < d < 15 
using three different SAT solvers, embodying different underlying algorithms: GRASP, 
which is based on the DLL method jT7|, the CPLEX MIP Solver, which is a commercial 
optimizer for integer-programming problems, and CUDD², which implements functions
to manipulate Reduced Ordered Binary Decision Diagrams (ROBDDs), providing an
efficient representation for Boolean functions [7].³

The findings in [13] show that for GRASP and CPLEX the easy-hard-easy pattern
is better described as an easy-hard-less-hard pattern, where, as is the standard usage 
in computational complexity theory, “easy” means polynomial time and “hard” means 
exponential time. When we start with low-density instances and then increase the density, 
we go from a region of polynomial running time to a region of exponential running 
time, where the exponent first increases and then decreases as a function of the density. 
Thus, one observes at least two phase transitions as the density is increased: a transition 
at about density 3.8 from polynomial to exponential running time and a transition at 
about density 4.26 (the crossover density) from an increasing exponent to a decreasing 
exponent.⁴ The region between 3.8 and 4.26 is also characterized by the prevalence of
very hard instances, the so-called “heavy-tail phenomenon”, cf. [23,28,30].

A very different picture emerged in [13] for CUDD (described in Section 3). Here
the algorithm is exponential (in both time and space) for densities between 0.5 and 15. 
There is, however, no running-time peak near the crossover density and no heavy-tail 
phenomenon was observed. A peak, however, is observed in the size of the final ROBDDs 
constructed by the algorithm at about density 2, indicating a phase transition at about 
this density. At a very low density (0.1), a polynomial (cubic) behavior is observed, 
which suggests that another phase transition is “lurking” between densities 0.1 and 0.5. 
Thus, unlike earlier predictions (cf. lE6ln . phase-transition phenomena related to random 
3-SAT are not solver independent. 

Our interest in studying ROBDD-based algorithms is motivated by the fact that 
ROBDDs have proven to be very effective in the context of hardware verification Il^l2^ 
and they are very different from standard search-based SAT solving methods. Uribe and 
Stickel compared ROBDDs with the DLL method for SAT solving, concluding that
the methods are incomparable, and that ROBDDs dominate the DLL method on many 
examples. Recent work by Groote and Zantema formally proved the incomparability of 

² http://bessie.colorado.edu/~fabio/CUDD

³ We use ROBDDs to represent Boolean functions. This is different from the usage in [10] of
(zero-suppressed) ROBDDs to represent compactly sets of clauses.

⁴ The polynomial to exponential phase transition, preceding the crossover point, was discovered
independently by Cocco and Monasson O. 
ROBDDs and resolution (which is the proof system underlying the DLL method) [22].
The comparison in [13] between GRASP and CPLEX, on one hand, and CUDD, on the
other hand, is, however, somewhat unenlightening. Unlike GRASP and CPLEX, CUDD 
does not search for a single satisfying truth assignment. Rather, it constructs a compact 
symbolic representation of the set of all satisfying truth assignments and then checks 
whether this set is nonempty. (Note, however, that for extremely sparse formulas, the
ROBDD-based algorithm is polynomial in spite of the fact that we have exponentially 
many satisfying truth assignments, due to the compactness of the representation.) In this 
paper we study the behavior of pure ROBDD-based SAT solvers. A pure SAT solver has 
to simply decide for a given propositional formula whether or not it is satisfiable; unlike 
search-based SAT solvers, it need not return a satisfying truth assignment.⁵ The key
step in constructing an ROBDD-based pure SAT solver is an aggressive application of 
existential quantification. (We describe the algorithm later on.) Once we have the basic
algorithm, we can apply several heuristic optimizations, resulting in rather dramatic 
improvement in running time. 

Our aim, however, is not to directly compare the performance of the different algo- 
rithms in order to see which one has the “best” performance, but rather to understand 
their behavior in the d x n quadrant in order to make qualitative observations on how 
the complexity of random 3-SAT is viewed from different algorithmic perspectives. It is 
important to note that the algorithms we used do not explicitly refer to the density of the 
input instances. Thus, a qualitative change in the behavior of the algorithm, as a result 
of changing the density, indicates a genuine structural change in the SAT instances from 
the perspective of the algorithm. 

Our main finding is that the optimized ROBDD-based pure SAT-solving algorithms
display an easy-hard-less-hard pattern that is quite similar to that observed for GRASP and
CPLEX in [13]. When we start with low-density instances and then increase the density,
we go from a region of polynomial running time, to a region of exponential running
time, where the exponent first increases and then decreases as a function of the density.
Thus, one again observes at least two phase transitions as the density is increased: a 
transition from polynomial to exponential running time, accompanied by a heavy-tail 
phenomenon, and a transition from an increasing exponent to a decreasing exponent. 
Surprisingly, however, the location of both phase transitions is algorithm dependent. 
Unlike what has been observed so far in numerous papers, the transition from increasing 
to decreasing exponent, which corresponds to the running-time peak as one increases the
density for a fixed order, does not occur at the crossover density of 4.26. For one
algorithm this transition occurs at density 3.8 and for the other at density 2.3.

Our findings provide further experimental evidence for the following two hypotheses. 
First, the running-time peak can change with the choice of solver not only in a minor
way, as noted in [28], but in quite a major way, moving quite dramatically from the
crossover density. This demonstrates that the correlation between the crossover density 
and computational hardness is algorithm-dependent, challenging the widely-held belief 
that the “hard problems” are always located at the crossover density o. Second, as 

⁵ Note, however, that by successively assigning truth values to the variables we can use a pure
SAT solver to find a satisfying truth assignment, increasing the running time only by a linear
multiplicative factor. This means that SAT enjoys self-reducibility.
observed in [13], the density-order quadrant contains several phase transitions; in fact,
the region between density 0 and density 4.26 seems to be rife with phase transitions, 
which are also solver dependent. In essence, each solver provides us with a different tool 
with which to study the complexity of random 3-SAT. This is analogous to astronomers 
observing the sky using telescopes that operate at different wave lengths. While our 
results are purely empirical, as the lack of success with formally proving a sharp threshold
at the crossover density indicates (cf. [18,17,1]), providing rigorous proof for our
qualitative observations may be a rather difficult task. 

2 Experimental Setup 

Our experimental setup is identical to that of [15,31,13]. We generate dn clauses, each
by picking three distinct variables at random and choosing their polarity uniformly. For 
each studied point in the dx n quadrant we generate at least 100 random instances and 
apply a solver. Our experiments were run on Sun Ultra 1 machines, with a 167 MHz
UltraSPARC processor and 256MB RAM. The CUDD package has been used through 
the GLU C-interface [34], a set of low-level utilities to access BDD packages. It is well
known that the size of the ROBDD for a given function depends on the variable order 
chosen for that function. We have used automatic dynamic reordering during the tests 
with the default method for automatic reordering of CUDD (except in Sectional where 
we used a certain fixed order). 
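
The instance generation described at the start of this section can be sketched as follows. This is an illustration only, not the generator used in the experiments: rounding dn to the nearest integer, allowing repeated clauses, and the function name `random_3sat` are assumptions made for this sketch.

```python
import random

def random_3sat(n, d, rng=random):
    """Return round(d*n) clauses over variables 1..n; a literal is +v or -v."""
    clauses = []
    for _ in range(round(d * n)):
        variables = rng.sample(range(1, n + 1), 3)   # three distinct variables
        # each polarity is chosen uniformly and independently
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in variables))
    return clauses

# One instance of order 20 at the crossover density, with a fixed seed.
instance = random_3sat(n=20, d=4.26, rng=random.Random(0))
```

Each point of the d x n quadrant would then correspond to at least 100 calls with the same n and d and different seeds.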

As in [13], we chose to focus on median running time rather than mean running time.
The difficulty of completing the runs on very hard instances makes it less practical to 
measure the mean. Furthermore, the median and the mean are typically quite close to 
each other, except for the regions that display heavy-tail phenomena, where the median 
and the mean diverge dramatically [20,30,13]. It would be interesting to analyze our data
at percentiles other than the 50th percentile (the median) (cf. IPDI ). though a meaningful 
analysis for high percentiles would require many more sample points than we have in 
our experiments. 

For the statistical analysis and plotting of data, we used MATLAB⁶, which is an integrated
technical computing environment that combines numeric computation, advanced
graphics and visualization, and a high-level programming language. The MATLAB
functions we used for statistical analysis were:

- polyfit, for computing the best fit to a set of data using polynomial regression, and

- corrcoef, for computing r², the square of the correlation coefficient (r² is the fraction
  of the variance of one variable that is explained by regression on the other variable).

For all the results reported in this paper, r² exceeded 0.98. This establishes high
confidence in the validity of the fit of the curve to the data points.
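
For readers without MATLAB, the degree-1 case of this analysis is easy to reproduce. The sketch below is a plain-Python stand-in for polyfit and corrcoef (it is not the code used for the paper): it fits a least-squares line and returns r². Fitting log2 of the median running time against the order n estimates the exponent in a running time that grows as a power of two in n. The sample data are invented for illustration and are not taken from the experiments.

```python
from math import log2

def linear_fit(xs, ys):
    """Least-squares line y = a*x + b, together with the squared correlation r^2."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx                       # slope
    b = my - a * mx                     # intercept
    r2 = (sxy * sxy) / (sxx * syy)      # fraction of variance explained
    return a, b, r2

# Made-up illustrative medians (seconds), NOT data from the paper.
orders = [20, 25, 30, 35, 40]
medians = [0.5, 1.1, 1.9, 4.2, 8.1]
a, b, r2 = linear_fit(orders, [log2(t) for t in medians])
```

Here `a` plays the role of the exponent coefficient, and a fit would be accepted under the criterion above only if `r2` exceeds 0.98.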

3 Random 3-SAT and CUDD 

In this section we review the results of [13] regarding random 3-SAT and CUDD. CUDD
m is a package that provides functions for the manipulation of Boolean functions, based 

⁶ http://www.mathworks.com
on the reduced, ordered, binary decision diagram (ROBDD) representation [7]. A binary
decision diagram (BDD) is a rooted directed acyclic graph that has only two terminal 
nodes labeled 0 and 1. Every non-terminal node is labeled with a Boolean variable and 
has two outgoing edges labeled 0 and 1 . An ordered binary decision diagram (OBDD) 
is a BDD with the constraint that the input variables are ordered and every path in the 
OBDD visits the variables in ascending order. An ROBDD is an OBDD where every 
node represents a distinct logic function. The support set of an ROBDD is the set of 
variables labeling its internal nodes. 

CUDD constructs a compact representation of the set of satisfying truth assignments.
The input formula $\varphi$ is a conjunction $c_1 \wedge \ldots \wedge c_m$ of 3-clauses, where $m = dn$. Our
algorithm constructs an ROBDD $A_i$ for each clause $c_i$. (Note that $A_i$ has to represent
only the seven satisfying truth assignments of $c_i$.) An ROBDD for the set of satisfying
truth assignments is then constructed incrementally; $B_1$ is $A_1$, while $B_{i+1}$ is the result
of apply($B_i$, $A_{i+1}$, $\wedge$), where apply($A$, $B$, $\circ$) is the result of applying a Boolean operator
$\circ$ to two ROBDDs $A$ and $B$. Finally, the resulting ROBDD $B_m$ is compared against the
predefined constant 0 (the empty ROBDD) in order to find if an instance is (un)satisfiable.
We call this the BDD algorithm.
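
To make the construction concrete, here is a minimal, self-contained ROBDD sketch in Python. It is not CUDD and does not reflect its API: the class `BDD`, its methods, and the fixed ascending variable order (no dynamic reordering) are all assumptions made for illustration. It builds one ROBDD per clause and conjoins them one by one, then compares the result with the constant 0.

```python
class BDD:
    """Tiny ROBDD manager: nodes are ints, 0 and 1 are the terminals."""

    def __init__(self):
        self.node = {}      # id -> (var, low, high)
        self.unique = {}    # (var, low, high) -> id, so equal nodes are shared

    def mk(self, var, low, high):
        if low == high:                      # redundant test: skip the node
            return low
        key = (var, low, high)
        if key not in self.unique:
            nid = len(self.node) + 2         # ids 0 and 1 are reserved
            self.node[nid] = key
            self.unique[key] = nid
        return self.unique[key]

    def var_of(self, u):
        return self.node[u][0] if u > 1 else float("inf")

    def apply_and(self, u, v, memo=None):
        """Conjunction of two ROBDDs (the apply operation with operator AND)."""
        memo = {} if memo is None else memo
        if u == 0 or v == 0:
            return 0
        if u == 1:
            return v
        if v == 1:
            return u
        if (u, v) not in memo:
            x = min(self.var_of(u), self.var_of(v))      # top variable
            u0, u1 = self.node[u][1:] if self.var_of(u) == x else (u, u)
            v0, v1 = self.node[v][1:] if self.var_of(v) == x else (v, v)
            memo[(u, v)] = self.mk(x, self.apply_and(u0, v0, memo),
                                   self.apply_and(u1, v1, memo))
        return memo[(u, v)]

    def clause(self, lits):
        """ROBDD of a disjunction of literals (+v or -v) on distinct variables."""
        u = 0
        for lit in sorted(lits, key=abs, reverse=True):   # build bottom-up
            u = self.mk(abs(lit), u, 1) if lit > 0 else self.mk(abs(lit), 1, u)
        return u


def bdd_satisfiable(clauses):
    """Plain BDD algorithm: conjoin the clause ROBDDs, then compare with 0."""
    mgr = BDD()
    b = 1
    for c in clauses:
        b = mgr.apply_and(b, mgr.clause(c))
    return b != 0
```

Because reduced, ordered diagrams are canonical for a fixed variable order, an unsatisfiable formula always yields the terminal 0, so the final test is a constant-time comparison.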

The goal of the experiments was to evaluate CUDD’s performance on an initial 
quadrangle of the d x n quadrant. Densities 0.1, 0.5, and 1 to 15 were explored in 
[13]. In Figure 1 the median running time is shown on a logarithmic (base 2) scale.
Note the absence of a peak; the running-time curve flattens roughly at density 2. The 
explanation for the lack of running-time peak is that the running time of ROBDD-based 
algorithms is determined mostly by the size of the manipulated ROBDDs. Our algorithm 
involves m = dn conjunction operations between the possibly large ROBDD Bi and 
the small ROBDD Ai. Thus, the running time of our algorithm is determined by the 
largest intermediate ROBDD Bi constructed. As shown in [13], the peak in ROBDD
size is attained after processing about 2n clauses, which explains the flattening of the 
running-time plot at density 2, and suggests that a phase transition in terms of ROBDD 
size occurs at about this density. 

The median running time was analyzed as a function of the order for fixed-density 
instances. At densities 0.5 and above, the median running time of CUDD is exponential 
in the order, i.e., it behaves as $2^{\alpha n}$. In contrast, at density 0.1 the running time is cubic.
This is explained by the fact that ROBDDs can represent very large sets quite compactly,
which is why the method is quite effective for very low-density instances, where the
number of satisfying truth assignments is very large. Unlike what is observed for search-based
algorithms, the BDD algorithm does not exhibit a heavy-tail phenomenon.

4 Existential Quantification of Variables 

CUDD enables us to apply existential quantification to an ROBDD $B$:

$$(\exists x)B = \mathrm{apply}(B|_{x \leftarrow 1}, B|_{x \leftarrow 0}, \vee),$$

where $B|_{x \leftarrow c}$ restricts $B$ to truth assignments that assign the value $c$ to the variable $x$.
Note that quantifying $x$ existentially eliminates it from the support set of $B$. We now see
how we can take advantage of existential quantification.
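
The definition can be illustrated independently of any BDD package. In the sketch below a Boolean function is simply a Python callable on an assignment dictionary; this representation and the helper names `restrict` and `exists` are assumptions of the example, chosen only to show the identity at work.

```python
def restrict(f, x, c):
    """Cofactor of f: fix variable x to the constant c."""
    return lambda a: f({**a, x: c})

def exists(f, x):
    """Existential quantification: (exists x) f = f[x<-1] OR f[x<-0]."""
    f1, f0 = restrict(f, x, True), restrict(f, x, False)
    return lambda a: f1(a) or f0(a)

# B = (x OR y) AND (NOT x OR z)
B = lambda a: (a["x"] or a["y"]) and ((not a["x"]) or a["z"])

# Quantifying x out leaves a function of y and z only, equivalent to (y OR z).
E = exists(B, "x")
```

Note that `E` can be evaluated on assignments that do not mention x at all, which is the sense in which x has left the support set.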
The satisfiability problem is to determine whether a given formula $c_1 \wedge \ldots \wedge c_m$ is
satisfiable. In other words, the problem is to determine whether the existential formula
$(\exists x_1) \ldots (\exists x_n)(c_1 \wedge \ldots \wedge c_m)$ is true. Since checking whether the final ROBDD $B_m$
is equal to 0 can be done by CUDD in constant time, it makes little sense, however, to
apply existential quantification to $B_m$. Suppose, however, that a variable $x_j$ does not
occur in the clauses $c_{i+1}, \ldots, c_m$. Then the existential formula can be rewritten as

$$(\exists x_1) \ldots (\exists x_{j-1})(\exists x_{j+1}) \ldots (\exists x_n)\bigl((\exists x_j)(c_1 \wedge \ldots \wedge c_i) \wedge (c_{i+1} \wedge \ldots \wedge c_m)\bigr).$$

This means that after constructing the ROBDD $B_i$, we can existentially quantify $x_j$
before conjoining $B_i$ with $A_{i+1}, \ldots, A_m$.

This suggests the following modification of our algorithm: after constructing the
ROBDD $B_i$, quantify existentially the variables that do not occur in the clauses $c_{i+1}, \ldots, c_m$.
In this case we say that the variable $x$ has been quantified out. The computational
advantage of quantifying out stems from the fact that reducing the size of the support set
of an ROBDD typically (though not necessarily) results in a reduction of its size; that
is, the size of $(\exists x)B$ is typically smaller than that of $B$. This method is called the early
quantification method, and was first proposed in the context of symbolic model checking
[8]. Early quantification was applied to SAT solving in [21] (under the name of hiding
functions) and tried on random 3-SAT instances, but without a systematic study of
the complexity of random 3-SAT. Our implementation adds the slight improvement of
stopping the construction as soon as we construct a $B_i$ that is equal to 0; this is called early
termination. We will call this algorithm, i.e., early quantification with early termination,
BDD(Q).
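
The following sketch illustrates the scheduling idea behind early quantification with early termination. It is not the CUDD-based implementation: as a stand-in for the ROBDD it keeps an explicit set of satisfying rows over the current support set, an assumption made only to keep the example short and self-contained. The function names are hypothetical.

```python
def last_occurrence(clauses):
    """Map each variable to the index of the last clause mentioning it."""
    last = {}
    for i, clause in enumerate(clauses):
        for lit in clause:
            last[abs(lit)] = i
    return last

def solve_early_quantification(clauses):
    """Decide satisfiability, quantifying variables out after their last clause.

    The intermediate result is kept as (support, rows): rows is the set of
    assignments to the support variables satisfying all clauses seen so far.
    """
    last = last_occurrence(clauses)
    support, rows = (), {()}                  # "true" over an empty support
    for i, clause in enumerate(clauses):
        # conjoin: extend the support with the clause's new variables
        new_vars = sorted({abs(l) for l in clause} - set(support))
        for _ in new_vars:
            rows = {r + (b,) for r in rows for b in (False, True)}
        support = support + tuple(new_vars)
        pos = {v: k for k, v in enumerate(support)}
        rows = {r for r in rows
                if any(r[pos[abs(l)]] == (l > 0) for l in clause)}
        if not rows:                          # early termination: result is 0
            return False
        # quantify out every variable that occurs in no later clause
        keep = [k for k, v in enumerate(support) if last[v] > i]
        rows = {tuple(r[k] for k in keep) for r in rows}
        support = tuple(support[k] for k in keep)
    return True
```

Projecting a variable away after its last occurrence is exactly the rewriting of the existential formula shown above, and the check for an empty set of rows plays the role of the comparison with the constant 0.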

Figure 2 (left) shows the median running time of BDD(Q) on a logarithmic (base 2)
scale. The median running time has decreased with respect to the BDD algorithm. At
order 46, for densities less than or equal to two we got an order of magnitude improvement
(10X) in running time. For greater densities, the improvement is only between 5% and
1.5%.

Fig. 2. BDD(Q) - (left) 3-D plot of median running time, and (right) median running time as a
function of the density for order 46



The overall shape of the running-time surface is somewhat similar to that observed
in Section 3; the running time increases with density and then seems to flatten. The
flattening, however, occurs at about density 4, rather than density 2. Note that once we
have processed $i = 4.3n$ clauses, the conjunction $c_1 \wedge \ldots \wedge c_i$ is with very high probability
unsatisfiable, which means that $B_i$ is with high probability equal to 0. Thus, BDD(Q)
typically terminates by the time $5n$ clauses have been processed, which explains the
flattening of the run-time surface for densities over 5. In Figure 2 (right) median running
times are shown as a function of the density, for order 46.

An interesting difference between the BDD and BDD(Q) algorithms is that the transition
from polynomial to exponential has shifted to the right. Our results indicate a
quadratic-time behavior at density 0.5 (see Figure 3, left), while at densities 1 and
above the median running time is exponential in the order; see Figure 3 (right) for median
running times for instances of density 1, on a logarithmic (base 2) scale. It should
also be noted that BDD(Q) also does not exhibit a heavy-tail phenomenon.



5 Reordering the Clauses 

BDD(Q) processes the clauses of the input formula in a linear fashion. Since the main
point of early quantification is to quantify variables out as early as possible, reordering
the clauses may enable us to do more aggressive early quantification. That is, instead of
processing the clauses in the order $c_1, \ldots, c_m$, we can apply a permutation $\pi$ and process
the clauses in the order $c_{\pi(1)}, \ldots, c_{\pi(m)}$. The permutation $\pi$ should be chosen so as to
minimize the number of variables in the support sets of the intermediate ROBDDs.
This observation was first made in the context of symbolic model checking, cf. [8,19,24,5].
Unfortunately, finding an optimal permutation $\pi$ is by itself a difficult optimization
problem, motivating a greedy approach: searching at each step for the clause that would
result in the maximum number of variables to be quantified out.

Fig. 3. BDD(Q) - (left) median running time for density 0.5 as a function of the order of the
instances; a quadratic function fits these points better than an exponential function, and (right)
median running time for density 1 (log scale)

Our proposed algorithm searches for a clause with the maximum number of variables
that have only one occurrence in the remaining clauses. If more than one clause is a possible
candidate then a second criterion is applied: from the candidate clauses, the algorithm
looks for one that shares the fewest variables with the remaining clauses. (This is as opposed
to HSJ, where the algorithm looks for a candidate that shares the most variables with the
remaining clauses. We have tried this latter heuristic, and the results are not as good as
with our heuristic.) The rationale of our heuristic is to quantify out variables as
soon as possible. We will call this algorithm BDD(Q,R).
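The greedy selection described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the clause representation (lists of non-zero integers, DIMACS style), the function name `greedy_clause_order`, and the final tie-break by clause index are assumptions made for the example. A variable counts as having "only one occurrence" when the candidate clause is the only remaining clause that mentions it, so it could be quantified out right after that clause is conjoined.

```python
from collections import Counter


def greedy_clause_order(clauses):
    """Return a processing order (list of clause indices).

    clauses: list of clauses, each a list of non-zero ints (hypothetical
    DIMACS-style literals; the sign of a literal is ignored here).
    """
    var_sets = [frozenset(abs(lit) for lit in clause) for clause in clauses]
    remaining = set(range(len(clauses)))
    # Number of remaining clauses in which each variable occurs.
    occurrences = Counter(v for vs in var_sets for v in vs)
    order = []
    while remaining:
        def key(i):
            # Variables whose only remaining occurrence is in clause i.
            unique = sum(1 for v in var_sets[i] if occurrences[v] == 1)
            # Variables that clause i shares with other remaining clauses.
            shared = len(var_sets[i]) - unique
            # Most unique variables first, then fewest shared variables,
            # then lowest index (the last criterion is an assumption).
            return (-unique, shared, i)

        best = min(remaining, key=key)
        order.append(best)
        remaining.remove(best)
        for v in var_sets[best]:
            occurrences[v] -= 1
    return order
```

For instance, on the three clauses `[1, 2, 3]`, `[3, 4, 5]`, `[1, 2, -3]` the sketch picks the second clause first, because variables 4 and 5 occur nowhere else.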

Figure 4 shows median running time using our algorithm. The median running time
has decreased quite dramatically with respect to the BDD algorithm. The improvements
are most dramatic at low and high densities. For example, for order 46, for density 1
we get a 30X improvement (i.e., the running time of BDD(Q,R) is about 0.03 times
that of BDD) and for densities 9 and above we get a 100X improvement, while
for density 4 we get a 6X improvement. Most interestingly, the shape of the running-time
surface is now similar to the shape of the running-time surface for search-based
algorithms (GRASP and CPLEX) in [13].

Unlike what we saw in [13], where the running-time peak roughly occurs at the
crossover density, the running-time peak for BDD(Q,R) seems to occur at about density 3.8.
In Figure 5, we plot the median running time in the “hard” zone, for 40 and 46 variables,
respectively, with 1000 experiments per point. It is interesting to note that density 3.8
is where the transition from polynomial to exponential running time for search-based
solvers was observed in [13].

Another interesting development is a further shift to the right of the transition from
polynomial to exponential median running time. At density 1 our data indicate a quadratic
running time. See Figure 6 (left) for median running times for instances of density 1, with
200 instances per point. For densities 1.5 and above the running time is exponential. See
Figure 6 (right) for median running times for instances of density 1.5 on a logarithmic
(base 2) scale. Thus, the transition occurs between densities 1 and 1.5. Recall that, in
contrast, the transition for the BDD algorithm occurs between densities 0.1 and 0.5,
while for BDD(Q) it occurs between densities 0.5 and 1. Thus, the improvement in the
algorithm is not merely quantitative, it is also qualitative, as it expands the region in
which the algorithm is feasible.

Fig. 4. BDD(Q,R) - 3-D plot of median running time

Fig. 5. BDD(Q,R) - median running time in the hard region, for order 40 (left) and 46 (right)

As with GRASP and CPLEX [13], the transition from polynomial to exponential
behavior of BDD(Q,R) is accompanied by a “heavy-tail phenomenon”, which is a prevalence
of outliers, i.e., instances on which the actual running time is at least an order of
magnitude (10X) larger than the median running time, as well as a divergence of the
mean and the median. See Figure 7, where we plot the mean-to-median ratio and the
proportion of outliers as a function of the density. Thus, in spite of the incomparability
of search-based solvers and ROBDD-based solvers H35I22II, we see a significant similarity
between the qualitative results in [13] and here. For GRASP, CPLEX, and
BDD(Q,R) alike, the algorithms are polynomial at low densities. As the density increases,
we see a transition from polynomial to exponential behavior, accompanied by a heavy-tail
phenomenon. As the density increases further, the exponent first increases and then
decreases. BDD(Q,R) differs, however, in the location of the running-time peak, which
is roughly at the crossover density for GRASP and CPLEX, and markedly to its left for
BDD(Q,R).

Fig. 6. BDD(Q,R) - (left) median running time for density 1 as a function of the order of the
instances; a quadratic function fits these points better than an exponential function, and (right)
median running time for density 1.5 (log scale)
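The two heavy-tail indicators used in the text (the mean-to-median ratio and the proportion of outliers, i.e., runs at least 10X slower than the median) are easy to compute from a sample of running times. The following is a small sketch; the function name `heavy_tail_stats` and the `factor` parameter are illustrative choices, not taken from the paper.

```python
from statistics import mean, median


def heavy_tail_stats(times, factor=10.0):
    """Return (mean/median ratio, proportion of outliers).

    An outlier is a run whose time is at least `factor` times the median
    (an order of magnitude by default, as in the text).
    """
    med = median(times)
    ratio = mean(times) / med
    outliers = sum(1 for t in times if t >= factor * med) / len(times)
    return ratio, outliers
```

A sample such as `[1, 1, 1, 1, 16]` has median 1 and mean 4, so one slow run out of five is enough to pull the mean well away from the median.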

A further improvement of early quantification and reordering was proposed in the
context of symbolic model checking in [23]. In this approach, the clauses are not processed
one at a time, but several clauses are first clustered together without being processed.
Once the size (number of clauses) of a cluster $C$ attains a pre-established bound,
we first apply conjunction to all the ROBDDs of the clauses in $C$ to obtain an
ROBDD $B_C$, and we then combine $B_C$ with the ROBDD $B_i$ (which corresponds to all
the clauses processed earlier) and apply early quantification. Obviously, setting higher
limits on the cluster size leads to fewer clusters, but a larger cluster $C$ results in a larger
ROBDD $B_C$. To quote 1129 1 : “as the size of the clusters is raised, the number of iterations
is reduced, while the BDD sizes of the formula increase. In the beginning, the reduction
in the number of iterations offsets the increase in BDD sizes. Hence initially, runtime
is reduced as the cluster size increases. But later, the BDD computation time starts to
dominate and the running time increases”.

Fig. 7. BDD(Q,R) - ratio of mean to median running time and proportion of outliers



We implemented clustering on top of BDD(Q,R) (that is, we order the clauses as in 
BDD(Q,R) before clustering). We will call this algorithm BDD(Q,R,C). Experimentation 
showed that the best results are obtained when cluster size is set to the “magic number” 
20. We found out that BDD(Q,R,C) performs badly at low densities, but yields an im- 
provement of 10%-30% for densities above 3. The qualitative behavior of BDD(Q,R,C) 
is, however, quite similar to that of BDD(Q,R): we observe a transition from polynomial 
to exponential, accompanied by a heavy-tail phenomenon, between densities 1.0 and 
1.5, and the exponent then rises and declines, peaking at about density 3.8. 

6 Variable Ordering 

The previous ROBDD-based methods focused on the processing of the input clauses,
while at the same time letting CUDD handle the critical issue of variable ordering (including
dynamic reordering). Inspired by work of Bouquet [5], we studied an ROBDD-based
algorithm using variable ordering based on a graph representation of the input formula.
As we shall see, by using knowledge about the structure of the input formula, we can
obtain dramatic improvement in running time.

The graph associated with a CNF formula $\varphi = \bigwedge_i c_i$ is $G_\varphi = (V, E)$, where $V$ is
the set of variables in $\varphi$ and an edge $\{x_i, x_j\}$ is in $E$ if there exists a clause $c_k$ such that
$x_i$ and $x_j$ occur in $c_k$. To extract a variable order from $G_\varphi$, Bouquet uses the “maximum
cardinality search” (MCS) of 1^ . Let $n$ be the number of vertices of $G_\varphi$. MCS numbers
the vertices from 1 to $n$ in the following way: as the next vertex to number, select the
vertex adjacent to the largest number of previously numbered vertices, breaking ties
arbitrarily. It is this variable ordering that we now provide to CUDD (turning off dynamic
reordering).
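As an illustration of the numbering rule just described, here is a minimal sketch of maximum cardinality search on the graph of a CNF formula. The clause representation (lists of non-zero integers), the name `mcs_order`, and the tie-breaking rule (smallest variable index first) are assumptions for the example; the text only says that ties are broken arbitrarily.

```python
def mcs_order(clauses):
    """Number the variables of a CNF formula by maximum cardinality search.

    clauses: list of clauses, each a list of non-zero ints.
    Returns a dict mapping each variable to its position (1..n).
    """
    # Build the graph: two variables are adjacent if they share a clause.
    adjacent = {}
    for clause in clauses:
        variables = {abs(lit) for lit in clause}
        for v in variables:
            adjacent.setdefault(v, set()).update(variables - {v})

    # weight[v] = number of already numbered neighbours of v.
    weight = {v: 0 for v in adjacent}
    numbered = []
    while weight:
        # Largest weight wins; ties go to the smallest variable index.
        v = max(weight, key=lambda u: (weight[u], -u))
        numbered.append(v)
        del weight[v]
        for w in adjacent[v]:
            if w in weight:
                weight[w] += 1
    return {v: position + 1 for position, v in enumerate(numbered)}
```

On the clauses `[1, 4, 5]`, `[2, 4, 6]`, `[3, 5, 6]` the sketch numbers the variables in the order 1, 4, 5, 6, 2, 3.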

Bouquet then uses the variable order to cluster the clauses. Let the rank of a clause
$c = \{l_1, l_2, l_3\}$ be $rank(c) = \max(order(x_1), order(x_2), order(x_3))$, where $x_i$ is the
variable of the literal $l_i$. The clusters are the equivalence classes of the relation $\sim$ defined
by: $c \sim c'$ iff $rank(c) = rank(c')$. For each cluster $C_j = \{c_{j_1}, \ldots, c_{j_k}\}$ we then
construct an ROBDD $A_{C_j}$ by applying conjunction to the ROBDDs $A_{j_1}, \ldots, A_{j_k}$. The
rank of a cluster is the rank of its clauses (by definition, all the clauses in a cluster have
the same rank).
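The rank-based clustering can be sketched in a few lines. The function name `cluster_by_rank` and the data layout (clauses as lists of non-zero integers, the variable order as a dictionary from variable to position) are assumptions for illustration; the ROBDD construction itself is not shown.

```python
from collections import defaultdict


def cluster_by_rank(clauses, order):
    """Group clauses into clusters of equal rank, in ascending rank order.

    clauses: list of clauses, each a list of non-zero ints.
    order:   dict mapping each variable to its position in the variable order.
    The rank of a clause is the largest position among its variables.
    """
    clusters = defaultdict(list)
    for clause in clauses:
        rank = max(order[abs(lit)] for lit in clause)
        clusters[rank].append(clause)
    return [clusters[rank] for rank in sorted(clusters)]
```

Returning the clusters sorted by rank matches the way they are consumed when clusters are processed in ascending rank order.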

In [6], the final ROBDD is constructed by applying conjunction to the ROBDDs
$A_{C_j}$ of the clusters. We have combined Bouquet’s method with the method of early
quantification. We process the clusters in ascending rank order and quantify variables
out as early as possible. We observed that early quantification plays an important role
in the low densities, where satisfying truth assignments abound. We denote the combined
method by BDD(B,Q,C). For densities 2 or above, BDD(B,Q,C) is significantly
faster than BDD(Q,R,C). At order 46 we saw improvement between 5X and 10X (for
lower densities BDD(B,Q,C) is about 30% slower). More interestingly, the shape of the
running-time surface is quite different for BDD(B,Q,C). Figure 8 shows the median running
time of BDD(B,Q,C) on a logarithmic (base 2) scale. As we can see, the interesting
region has moved to the left. The running-time peak now seems to occur at about density
2.3. Figure 8 (right) shows median running times for order 60.




Fig. 8. BDD(B,Q,C) - (left) 3-D plot of median running time, and (right) median running times
as a function of the density for order 60



We again see a transition from polynomial to exponential behavior before the
running-time peak, between densities 0.2 and 1 (to the left of the analogous transition
for BDD(Q,R)). For very low densities (0.2 or below) our data indicate a cubic
running time. See Figure 9 (left) for median running times for instances of density 0.2.
For densities 1 and above the median running time of BDD(B,Q,C) is exponential (but
see remark below). See Figure 9 (right) for median running time for instances of density
1 on a logarithmic (base 2) scale. The transition from polynomial to exponential behavior
is again accompanied by a heavy-tail phenomenon. The pattern of that phenomenon is
significantly more complex than that observed for BDD(Q,R) and we have not yet been
able to characterize it.








Fig. 9. BDD(B,Q,C) - (left) median running time for density 0.2 as a function of the order of the 
instances; a quadratic function fits these points better than an exponential function, and (right) 
median running time for density 1 (log scale) 



Remark: Note that the running time decreases quite dramatically with increasing densities
above 2.3. Is it possible that at high enough density we again see polynomial
behavior? Our data are inconclusive. For example, at density 20 our data fit cubic and
exponential curves almost equally well. This issue requires further investigation.



7 Discussion 

In this paper we studied the complexity of random 3-SAT experimentally using ROBDD-based
pure SAT solvers. Our main finding is that these solvers display an easy-hard-less-hard
pattern that is quite similar to that observed for search-based solvers in [13]. When
we start with low-density instances and then increase the density, we go from a region of
polynomial running time to a region of exponential running time, where the exponent
first increases and then decreases as a function of the density. The locations of both
transitions, from polynomial to exponential and from increasing to decreasing exponent,
are algorithm dependent. In particular, the running-time peak is quite independent of
the crossover density, challenging the widely-held belief that the “hard problems” are
always located near the crossover density El.

These findings should be contrasted with those of [13], which revealed a marked difference
between solvers like GRASP and CPLEX, which are search based and display
interesting similarities in the shapes of the median running time surface despite their
different underlying algorithmic techniques, and ROBDD-based solvers, like CUDD,
which are based on compactly representing all satisfying truth assignments. By developing
ROBDD-based pure SAT solvers here, we showed that certain qualitative features
of the complexity of random 3-SAT do seem to be algorithm independent. Explaining
these common features is a challenging research problem.
Abstract. We present Regular-SAT, an extension of Boolean Satisfiability
based on a class of many-valued CNF formulas. Regular-SAT shares
many properties with Boolean SAT, which allows us to generalize some 
of the best known SAT results and apply them to Regular-SAT. In addi- 
tion, Regular-SAT has a number of advantages over Boolean SAT. Most 
importantly, it produces more compact encodings that capture problem 
structure more naturally. Furthermore, its simplicity allows us to develop 
Regular-SAT solvers that are competitive with SAT and CSP procedures. 
We present a detailed performance analysis of Regular-SAT on several 
benchmark domains. These results show a clear computational advan- 
tage of using a Regular-SAT approach over a pure Boolean SAT or CSP 
approach, at least on the domains under consideration. We therefore 
believe that an approach based on Regular-SAT provides a compelling 
intermediate approach between SAT and CSPs, bringing together some 
of the best features of each paradigm. 



1 Introduction 

In the last few years, the tremendous advance in the state of the art of SAT 
solvers, combined with progress in hardware design, has led to the development 
of very fast SAT solvers. As a consequence, SAT encodings have become com- 
petitive with specialized CSP encodings in several domains. However, there is a 
tradeoff between using a uniform encoding, such as SAT, and a more structured 
encoding, as found in the CSP paradigm. 

In general, CSP-based encodings capture problem structure in a more natural 
way than SAT encodings. CSP encodings therefore allow in principle for highly 
efficient solution strategies that exploit inherent problem structure. However, in 
order to take full advantage of the CSP approach, the user may be required to 
develop specialized propagation and search techniques that may be difficult to 
implement efficiently. In a SAT formulation, some of the intricate problem struc- 
ture may be lost, but the availability of highly optimized general SAT solvers 
can often compensate for not directly exploiting inherent problem structure. 

Our goal is to provide an encoding paradigm that is sufficiently uniform, so 
that we can develop general solvers, and at the same time allows us to recover 
the problem structure in a more straightforward manner. Our approach is based 

T. Walsh (Ed.): CP 2001, LNCS 2239, pp. 137- 11^ 2001. 
on using so-called Regular-SAT encodings. Regular-SAT retains the uniformity
and simplicity of Boolean SAT but in addition captures problem structure in a
more straightforward manner. Regular-SAT is an extension of Boolean Satisfiability
based on a special class of many-valued CNF formulas, called regular CNF
formulas [miH]. These clausal forms have their origin in the many-valued logic
community [3] and are similar to Boolean CNF formulas, except that they use
a generalized notion of literal. A literal now is an expression of the form $S : p$,
where $p$ is a propositional variable and $S$ is a subset of truth values having a
particular structure.

Although more general than SAT, Regular-SAT has many properties in common
with traditional Boolean SAT. For example, we have tractable cases such
as Regular 2-SAT fTH] and Regular Horn-SAT [15], and there exist well-defined
phase transition boundaries in random formula ensembles.

Our results show that we can solve certain combinatorial problems more effi- 
ciently by using Regular-SAT encodings, compared to approaches based on state- 
of-the-art SAT or CSP approaches. We present results for both local search and 
systematic search. Moreover, we show that the Regular-SAT encodings nicely 
preserve certain structural properties of the original problem domain. In par- 
ticular, we consider the so-called backbone structure of the problem domain. 
Phase-transition properties of backbone structure are properly preserved in a 
Regular-SAT encoding; in a Boolean SAT approach, on the other hand, the 
phase transition structure in the backbone is not directly recoverable. 

The paper is structured as follows. We begin by formally defining the satisfiability
problem of regular CNF formulas (Section 2). In Section 3, we
describe Regular-DP and Regular-WalkSAT, which are generalizations of the
so-called Davis-Putnam procedure (though it is actually due to Davis, Logemann
and Loveland 0) and WalkSAT [22]. In Section 4, we present a detailed evaluation
of the performance of Regular-SAT procedures. In Section 5, we compare
Boolean SAT and Regular-SAT w.r.t. capturing problem structure. Section 6
gives overall conclusions.



2 The SAT Problem of Regular CNF Formulas 

Regular-SAT is the problem of deciding the satisfiability of regular CNF formulas.
A regular CNF formula is a classical propositional conjunctive clause form
based on a generalized notion of literal, called regular literal. Given a truth value
set $T$ ($|T| \geq 2$) equipped with a total ordering $\leq$, a regular literal is an expression
of the form $S : p$, where $p$ is a propositional variable and $S$ is a subset of $T$
which is either of the form $\uparrow i = \{j \in T \mid j \geq i\}$ or of the form $\downarrow i = \{j \in T \mid j \leq i\}$
for some $i \in T$. The informal meaning of $S : p$ is “$p$ is constrained to the values
in $S$”, and one can consider the language of regular CNF formulas as a constraint
programming language between SAT and CSP.

Definition 1. A truth value set is a non-empty set $T = \{i_1, i_2, \ldots, i_n\}$,
equipped with a total ordering $\leq$. A sign is a set $S \subseteq T$ of truth values. For
each element $i$ of the truth value set $T$, let $\uparrow i$ denote the sign $\{j \in T \mid j \geq i\}$,
and let $\downarrow i$ denote the sign $\{j \in T \mid j \leq i\}$. A sign $S$ is regular if it is identical
to $\uparrow i$ or to $\downarrow i$ for some $i \in T$.



Definition 2. A regular literal is an expression of the form $S : p$, where $S$ is
a regular sign and $p$ is a propositional variable. The complementary literal of
$S : p$ is $(T \setminus S) : p$. A regular literal $S : p$ is of positive (negative) polarity if $S$
is of the form $\uparrow i$ ($\downarrow i$) for some $i \in T$. A regular clause is a finite set of regular
literals. A regular CNF formula is a finite set of regular clauses.



Example 1. Let $T$ be the set $\{0, 1, 2\}$ with the standard order on natural numbers.
An example of a regular CNF formula is

$(\downarrow 0 : p_1 \vee \downarrow 1 : p_2 \vee \uparrow 2 : p_3) \wedge (\uparrow 1 : p_1 \vee \downarrow 0 : p_2)$.



Definition 3. An interpretation is a mapping that assigns to every propositional
variable an element of the truth value set. An interpretation $I$ satisfies
a regular literal $S : p$ iff $I(p) \in S$. An interpretation satisfies a regular clause
iff it satisfies at least one of its regular literals. A regular CNF formula $F$ is
satisfiable iff there exists at least one interpretation that satisfies all the regular
clauses in $F$. A regular CNF formula that is not satisfiable is unsatisfiable. The
empty regular clause, denoted by $\Box$, is always unsatisfiable and the empty regular
CNF formula is always satisfiable.
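Definitions 1 to 3 translate almost directly into code. The sketch below is only an illustration under assumed names (`RegLit`, `satisfies`) and an assumed representation: a regular literal is a variable name, a direction flag (up or down) and a bound, and truth values are integers with their usual order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegLit:
    """A regular literal S:p, where S is (up i) or (down i)."""
    var: str     # propositional variable p
    up: bool     # True for "up i" = {j >= i}, False for "down i" = {j <= i}
    bound: int   # the truth value i

    def sign(self, T):
        """Return the regular sign S as a set of truth values from T."""
        if self.up:
            return {j for j in T if j >= self.bound}
        return {j for j in T if j <= self.bound}


def satisfies(interp, formula, T):
    """True iff interpretation `interp` satisfies every clause of `formula`.

    interp:  dict from variable name to truth value.
    formula: list of clauses, each a list of RegLit.
    """
    return all(
        any(interp[lit.var] in lit.sign(T) for lit in clause)
        for clause in formula
    )
```

With this encoding the empty formula is satisfied by any interpretation and a formula containing the empty clause by none, as Definition 3 requires.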

Regular-SAT has advantages over SAT, as well as interesting computational 
properties: 

— SAT is a special case of Regular-SAT : any SAT instance can be transformed 
into a logically equivalent Regular-SAT instance of the same size by taking
$T = \{0, 1\}$ and replacing every literal $p$ ($\neg p$) with $\uparrow 1 : p$ ($\downarrow 0 : p$).

— Regular CNF formulas are a more expressive representation formalism than 
classical CNF formulas, and give rise to more compact encodings (fewer
clauses, variables, etc.) for many combinatorial problems.

— Classical proof methods like resolution, and satisfiability algorithms like 
Davis-Putnam, GSAT and WalkSAT can be generalized to deal with reg- 
ular CNF formulas in a natural way. As we will see, the good properties of 
the classical algorithms remain in regular algorithms, and one does not have 
to start from scratch when designing algorithms and heuristics. 

— Regular-SAT, like SAT, is one of the syntactically and conceptually simplest 
NP-complete problems. The design, implementation and analysis of algo- 
rithms for Regular-SAT tend to be easier than for other CSP algorithms. 

— Using regular signs instead of arbitrary subset of truth values as signs has 
clear advantages. For instance, 2-SAT is solvable in polynomial-time when 
signs are regular while it is NP-complete for arbitrary signs [18]. Horn CNF 
formulas admit a natural generalization because regular signs have polarity 
and Regular Horn-SAT is solvable in polynomial-time. 
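The first property in the list above (SAT as a special case of Regular-SAT) amounts to a literal-by-literal rewriting over $T = \{0, 1\}$. A minimal sketch, assuming DIMACS-style integer literals as input and an ad hoc tuple format `(direction, bound, variable)` for regular literals (both are choices made for this example only):

```python
def boolean_to_regular(cnf):
    """Rewrite a Boolean CNF as a regular CNF over T = {0, 1}.

    cnf: list of clauses, each a list of non-zero ints (v means the
    positive literal of variable v, -v its negation).
    A positive literal becomes ("up", 1, v), i.e. the sign {1};
    a negative literal becomes ("down", 0, v), i.e. the sign {0}.
    """
    return [
        [("up", 1, lit) if lit > 0 else ("down", 0, -lit) for lit in clause]
        for clause in cnf
    ]
```

The output has exactly as many clauses and literals as the input, which is the "same size" claim made in the text.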
3 Regular-SAT Algorithms

In this section we first describe Regular-DP and then Regular-WalkSAT, which
are generalizations of the Davis-Putnam procedure and WalkSAT. Regular-DP
is based on the following rules:

Regular one-literal rule: given a regular CNF formula $F$ containing a regular
unit clause $\{S : p\}$,

1. remove all clauses containing a literal subsumed by $S : p$; i.e., all clauses
containing a literal $S' : p$ such that $S \subseteq S'$;

2. delete all occurrences of literals $S'' : p$ such that $S \cap S'' = \emptyset$.

Regular branching rule: reduce the problem of determining whether a regular
CNF formula $F$ is satisfiable to the problem of determining whether
$F \cup \{S : p\}$ is satisfiable or $F \cup \{(T \setminus S) : p\}$ is satisfiable, where $S : p$ is a regular
literal occurring in $F$ and the regular literal $(T \setminus S) : p$ is its complement.

The pseudo-code of Regular-DP is shown in Figure 1. It returns true (false)
if the input regular CNF formula $F$ is satisfiable (unsatisfiable). First, it applies
repeatedly the regular one-literal rule and derives a simplified formula $F'$. Once
the formula cannot be further simplified, it selects a regular literal $S : p$ of $F'$,
applies the branching rule and solves recursively the problem of deciding whether
$F' \cup \{S : p\}$ is satisfiable or $F' \cup \{(T \setminus S) : p\}$ is satisfiable. In the pseudo-code,
$F_{S:p}$ denotes the formula obtained after applying the regular one-literal rule to
a regular CNF formula $F$ using the regular unit clause $\{S : p\}$.
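A compact executable sketch of the two rules and of the recursive procedure is given below. It is an illustration under stated assumptions, not the authors' implementation: signs are represented as Python frozensets of truth values, a literal is a pair `(sign, variable)`, a clause is a frozenset of literals, the branching literal is picked arbitrarily from the first clause (no Jeroslow-Wang heuristic), and the helper names `up`, `down`, `apply_unit` and `regular_dp` are invented for the example. All signs are assumed to be regular.

```python
def up(i, T):
    """The regular sign 'up i' = {j in T | j >= i}."""
    return frozenset(j for j in T if j >= i)


def down(i, T):
    """The regular sign 'down i' = {j in T | j <= i}."""
    return frozenset(j for j in T if j <= i)


def apply_unit(formula, sign, var):
    """Regular one-literal rule for the unit clause {sign:var}."""
    result = []
    for clause in formula:
        # Rule 1: drop clauses containing a literal S':var with sign <= S'.
        if any(v == var and sign <= s for s, v in clause):
            continue
        # Rule 2: delete literals S'':var whose sign is disjoint from `sign`.
        result.append(frozenset(
            (s, v) for s, v in clause if not (v == var and not (sign & s))
        ))
    return result


def regular_dp(formula, T):
    """Return True iff the regular CNF `formula` is satisfiable."""
    if not formula:
        return True                      # empty formula
    if any(len(clause) == 0 for clause in formula):
        return False                     # empty clause
    for clause in formula:               # regular one-literal rule
        if len(clause) == 1:
            (sign, var), = clause
            return regular_dp(apply_unit(formula, sign, var), T)
    sign, var = next(iter(formula[0]))   # regular branching rule
    if regular_dp(apply_unit(formula, sign, var), T):
        return True
    complement = T - sign
    if not complement:                   # sign was all of T: nothing left to try
        return False
    return regular_dp(apply_unit(formula, complement, var), T)
```

A real solver would choose the branching literal with a heuristic and avoid copying the formula at every node; the sketch favours closeness to the rules over efficiency.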

Observe that the Davis-Putnam procedure is a particular case of Regular-DP.
Our implementation of Regular-DP incorporates two branching heuristics
which are extensions of the two-sided Jeroslow-Wang rule mm. Given a regular
CNF formula $F$, such heuristics select a regular literal $L$ occurring in $F$ that



procedure Regular-DP
Input: a regular CNF formula F
Output: true if F is satisfiable and false if F is unsatisfiable
begin
    if F = ∅ then return true;
    if □ ∈ F then return false;
    /* regular one-literal rule */
    if F contains a unit clause {S':p} then return Regular-DP(F_{S':p});
    let S:p be a regular literal occurring in F;
    /* regular branching rule */
    if Regular-DP(F_{S:p}) then return true;
    else return Regular-DP(F_{(T\S):p});
end



Fig. 1. The Regular-DP procedure 
maximizes $J(L) + J(\bar{L})$, where $J(L)$ can be defined either as in Equation 1 or
as in Equation 2:

$$J(L) = \sum_{\exists L' :\; L' \subseteq L,\; L' \in C \in F} 2^{-|C|} \qquad (1)$$

$$J(L) = \sum_{\exists L' :\; L' \subseteq L,\; L' \in C \in F} \left( \prod_{S:p \,\in\, C} \frac{|T| - |S|}{2(|T| - 1)} \right) \qquad (2)$$

where $\bar{L}$ denotes the complement of literal $L$, $L' \subseteq L$ denotes that literal $L'$
subsumes literal $L$, $|C|$ denotes the number of literals in clause $C$, and $|S|$ the
number of truth values in sign $S$.

Equation 1 assigns a larger value to those regular literals $L$ subsumed by
regular literals $L'$ that appear in many small clauses. This way, when Regular-DP
branches on $L$, the probability of deriving new regular unit clauses is larger.
Equation 2, which was used in our experiments, takes into account the length of
regular signs as well. This is important because regular literals with small
signs have a larger probability of being eliminated during the application of the
regular one-literal rule. Observe that when $|T| = 2$ the two equations
coincide.
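The two scoring functions can be sketched as below. Two assumptions should be stated loudly: first, Equation 2 is taken here to be the product form, i.e., each clause containing a literal that subsumes $L$ contributes the product of $(|T| - |S|) / (2(|T| - 1))$ over all its literals $S : p$; this reading is a reconstruction, chosen because it reduces to $2^{-|C|}$ when $|T| = 2$. Second, the representation (literals as `(sign, variable)` pairs with signs as frozensets) and the names `j_eq1` and `j_eq2` are illustrative only.

```python
from math import prod


def _subsuming_clauses(lit, formula):
    """Clauses containing some literal S':p with S' a subset of S (L' subsumes L)."""
    sign, var = lit
    return [c for c in formula if any(v == var and s <= sign for s, v in c)]


def j_eq1(lit, formula):
    """Equation 1: sum of 2^(-|C|) over the qualifying clauses C."""
    return sum(2.0 ** -len(c) for c in _subsuming_clauses(lit, formula))


def j_eq2(lit, formula, T):
    """Equation 2 (assumed product form): signs with few values weigh more."""
    n = len(T)
    return sum(
        prod((n - len(s)) / (2 * (n - 1)) for s, _ in c)
        for c in _subsuming_clauses(lit, formula)
    )
```

With two truth values every sign $\uparrow 1$ or $\downarrow 0$ is a singleton, so each factor is 1/2 and both functions return the same score.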

Regular-WalkSAT, whose pseudo-code is shown in Figure 2, tries to find
a satisfying interpretation for a regular CNF formula $F$ by performing a greedily
biased walk through the space of possible interpretations. It starts with a randomly
generated interpretation $I$. If $I$ does not satisfy $F$, it proceeds as follows:
(i) it randomly chooses an unsatisfied clause $C$, (ii) it chooses, using function
select-WalkSAT, a variable-value pair $(p', k')$ from the set $S$ of pairs $(p, k)$
such that $C$ is satisfied by the current interpretation $I$ if the truth value that
$I$ assigns to $p$ is changed to $k$, and (iii) it creates a new interpretation $I'$ that
is identical to $I$ except that $I'(p') = k'$. Such changes are repeated until either



procedure Regular-WalkSAT
Input: a regular CNF formula F, MaxChanges, MaxTries and ω
Output: a satisfying interpretation of F, if found
begin
    for i := 1 to MaxTries
        I := a randomly generated interpretation for F;
        for j := 1 to MaxChanges
            if I satisfies F then return I;
            Pick one unsatisfied clause C from F;
            S := { (p, k) | S':p ∈ C, k ∈ S' };
            (p', k') := select-WalkSAT(S, F, ω);
            I := I with the truth assignment of p' changed to k';
    return “no satisfying interpretation found”;
end



Fig. 2. The Regular-WalkSAT procedure
a satisfying interpretation is found or a pre-set maximum number of changes
(MaxChanges) is reached. This process is repeated as needed, up to a maximum 
of MaxTries times. 

Function select-WalkSAT calculates, for each pair $(p, k) \in S$, the number of
broken clauses; i.e., the number of clauses that are satisfied by $I$ but that would
become unsatisfied if the assignment of $p$ is changed to $k$. If the minimum number
of broken clauses found ($u$) is greater than zero then either it randomly chooses,
with probability $\omega$, a pair $(p', k')$ from $S$ or it randomly chooses, with probability
$1 - \omega$, a pair $(p', k')$ from those pairs for which the number of broken clauses is
$u$. If $u = 0$, then it randomly chooses a pair from those pairs for which the number
of broken clauses is 0.
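The procedure and its selection function can be sketched as follows. This is a simplified illustration, not the Regular-WalkSAT code used in the experiments: literals are `(sign, variable)` pairs with signs as sets of truth values, broken clauses are recounted from scratch at every step, and the extra parameters (`interp`, `rng`, `variables`, `seed`) as well as the function names are assumptions introduced to make the example self-contained and reproducible.

```python
import random


def clause_satisfied(clause, interp):
    """A clause holds if some literal S:p has interp[p] in S."""
    return any(interp[var] in sign for sign, var in clause)


def select_walksat(pairs, formula, interp, omega, rng):
    """Pick a (variable, value) pair as described for select-WalkSAT."""
    def broken(var, value):
        trial = dict(interp)
        trial[var] = value
        # Clauses satisfied now that the change would falsify.
        return sum(1 for c in formula
                   if clause_satisfied(c, interp) and not clause_satisfied(c, trial))

    counts = {pair: broken(*pair) for pair in pairs}
    u = min(counts.values())
    if u > 0 and rng.random() < omega:
        return rng.choice(pairs)                       # noise step
    return rng.choice([pair for pair in pairs if counts[pair] == u])


def regular_walksat(formula, variables, T, max_tries, max_changes, omega, seed=0):
    """Return a satisfying interpretation (dict) or None if none was found."""
    rng = random.Random(seed)
    values = sorted(T)
    for _ in range(max_tries):
        interp = {var: rng.choice(values) for var in variables}
        for _ in range(max_changes):
            unsatisfied = [c for c in formula if not clause_satisfied(c, interp)]
            if not unsatisfied:
                return interp
            clause = rng.choice(unsatisfied)
            # All changes that would make the chosen clause true.
            pairs = sorted({(var, k) for sign, var in clause for k in sign})
            var, value = select_walksat(pairs, formula, interp, omega, rng)
            interp[var] = value
    return None
```

As in the pseudo-code, the satisfaction test happens before each change, so the last change of a try is not checked; a practical implementation would also maintain break counts incrementally rather than rescanning the formula.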

To the best of our knowledge, the first implementations of local search algorithms
for non-Boolean satisfiability were Regular-GSAT |B] and Regular-WalkSAT.
In our experiments we used the last available version (10.0) of Regular-WalkSAT,
which is faster than the previous ones. Recently, Frisch and Peugniez [H] have
considered a class of non-Boolean formulas where the signs of literals are singletons,
and have implemented an efficient local search algorithm for this kind of
formula. Their results show that using non-Boolean satisfiability encodings and
solvers is a competitive generic problem solving approach. The reader is invited
to consult [2] for related many-valued satisfiability problems and algorithms.

4 Performance Evaluation 

A key question regarding Regular-WalkSAT and Regular-DP is how their performance
compares to standard WalkSAT and DP. In this section, we consider four
benchmark domains. Our results show that there is a concrete computational 
advantage to using the Regular-SAT procedures, at least on these domains. This 
suggests that the more compact Regular-SAT encodings, which also preserve 
more of the problem structure, allow for a more efficient, yet general, solution 
strategy. 

This section is divided into two parts. The first part summarizes our results on
three benchmarks: graph coloring, round robin scheduling, and all interval series.
These results were obtained as part of the first author’s Ph.D. dissertation [1]. 
Here we only present a summary of the main results. We refer to the thesis for a 
detailed description of the problem encodings and more detailed run time data. 

The second part gives a more detailed evaluation of our Regular-WalkSAT
strategy on the quasigroup domain, which provides a structured benchmark with 
fine control of problem hardness. 



4.1 Graph Coloring, Round Robin, and All Interval Series 

Our problem domains are graph coloring (flat graphs [8] and DIMACS benchmark
instances), round robin scheduling [19], and all interval series (ais) [Itij.
The problems were selected not because of their inherent hardness per se, but
because they are known to be hard to solve with SAT algorithms.

For local search algorithms, we observed that the mean cost needed to solve
an instance with Regular-WalkSAT is smaller than with WalkSAT in the three
problems. This was true in terms of both number of flips and time, although
the difference in the time needed was not as significant as in the number of
flips.^1 It was shown in [3] that the performance for the round robin problem is
slightly better with Regular-WalkSAT, and is considerably better for the other
two problems.

Figure 3 shows the mean number of flips and mean time needed to solve
instances of the ais problem of different size. The number of flips is between 7
and 10 times smaller with Regular-WalkSAT, and the time is always about
2 times smaller with Regular-WalkSAT.





Fig. 3. Scaling behaviour of Regular-SAT and SAT on the ais problem 



Local search algorithms for Regular-SAT are better in terms of the mean
cost, but also in terms of the cost distribution as a whole. In fact, the cost
distribution for Regular-SAT on a particular instance dominates the cost distribution
for SAT on the same instance. In other words, the probability of finding
a solution in fewer than $x$ flips is always greater with Regular-WalkSAT for each
$x$. Moreover, we observed that the computational cost follows an exponential
distribution, at least when solving the instances with approximately optimal
noise. This was observed before for local search algorithms for SAT [16]. Figure 4
shows the distribution for the number of flips (RLD) for both algorithms
when solving the DIMACS graph coloring instance DSJC125.5.col. The figure
also shows the exponential distributions (EDs) that were found to best approximate
the empirical distributions. The expression for the cumulative form of the
EDs is $ed[m](x) = 1 - 2^{-x/m}$, where $m$ is the median of the distribution. The
approximations were derived using the Marquardt-Levenberg algorithm.
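For reference, the cumulative exponential distribution parameterized by its median, $ed[m](x) = 1 - 2^{-x/m}$, and an empirical run-length distribution can be computed as in the sketch below. The function names are illustrative, and no curve fitting is performed here; the Marquardt-Levenberg fit mentioned in the text is not reproduced.

```python
def ed_cdf(x, m):
    """Cumulative exponential distribution with median m: 1 - 2^(-x/m)."""
    return 1.0 - 2.0 ** (-x / m)


def empirical_rld(flips):
    """Empirical run-length distribution of a list of flip counts.

    Returns sorted (x, P) pairs, where P is the fraction of runs that
    found a solution within x flips.
    """
    xs = sorted(flips)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]
```

By construction `ed_cdf(m, m)` equals 0.5, which is what makes $m$ the median of the distribution.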

For systematic search, we compared the performance of Regular-DP with the 
performance of DP when solving Regular-SAT and SAT encodings, respectively, 
of flat graph problem instances. When we say DP we mean our implementation 
of Regular-DP but working with T = {0, 1}. In order to study only the benefits of 
the encodings, both algorithms used the function of Equation 2 in the branching 
heuristic. Table [1] shows the mean cost needed to solve a flat graph instance 

^ However, we cannot consider our current version of Regular-WalkSAT (10.0) as 
optimized as the current one of WalkSAT (35.0). 
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Fig. 4. RLDs for Regular-SAT and SAT on instance DSJC125 . 5 . col 



with both approaches, as well as the coefficient of variation (CV) that is the 
ratio between the standard deviation and the mean of the cost. The table shows 
results for sets of instances obtained with different values for the number of 
vertices and the number of colors used. For 4 colours and 150 vertices, only 
10% of instances were solved with DP, and 85% of instances were solved with 
Regular-DP; in both cases we used a cutoff of 4 hours, and the results shown 
correspond to the instances successfully solved by both approaches. Observe that 
even if the number of nodes in Regular-DP is not smaller in all the cases, the 
time is always smaller. The likely explanation for this phenomenon is that the 
number of unit propagations per node is sufficiently small to compensate for a 
larger number of backtrack nodes. These results indicate that Regular-DP, using 
our simple branching heuristic, is more effective on the Regular-SAT encoding. 
In fact, the information contained in the regular literals may help the heuristic 
to make better decisions. We expect that by incorporating more sophisticated 
heuristics in Regular-DP (e.g. extensions of look-ahead in] and look-back |2] 
heuristics) we will extend the range and size of instances that Regular-DP can 
solve faster than state-of-the-art SAT solvers. 
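
The coefficient of variation reported in the table can be computed as in the following minimal Python sketch. Whether the population or the sample standard deviation was used is not stated in the text, so the use of the population form here is an assumption, as are the function name and the sample costs.

```python
import statistics


def coefficient_of_variation(costs):
    """Ratio between the standard deviation and the mean of the costs.

    Assumption: population standard deviation (statistics.pstdev).
    """
    return statistics.pstdev(costs) / statistics.mean(costs)


# Hypothetical costs (e.g. backtrack nodes) over a set of instances.
sample_costs = [2, 4, 4, 4, 5, 5, 7, 9]
print(coefficient_of_variation(sample_costs))
```

A CV above 1, as in most entries of the table, indicates that the standard deviation exceeds the mean cost.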



4.2 Quasigroup Domain 

The quasigroup with holes problem (QWH) was recently introduced in [1]. This
problem considers randomly generated instances of the quasigroup (or Latin 
square) completion problem (QCP) [I^, and all instances are satisfiable and 
thus well-suited for evaluating local search methods. The structure of QWH is 
similar to that found in real-world domains; for example timetabling, routing, 
and scheduling. Instances are generated by first randomly generating a complete 
quasigroup, and then erasing some of the colors of the quasigroup (punching 
“holes”). The hardness of completing a QWH instance can be finely controlled 
by the number of holes punched. With relatively few holes, a completion is easy 
because the problem is highly constrained; similarly, instances with a large fraction
of holes are relatively easy to solve, since the instances are under-constrained
and many possible completions exist. In [1], it is shown that there is a region
of very hard completion problems in between these two extremes. The hard instances
arise in the vicinity of a phase transition threshold in the average size of
the so-called backbone [1]. The backbone of an instance measures the amount of
shared structure among solutions [20]. In Section 5 we show that Regular-SAT
captures the backbone in a natural way.

Table 1. Results of DP and Regular-DP on flat graph instances

                    vertices = 100                  vertices = 150
                    Encoding                        Encoding
                    classical       regular         classical       regular
colors                  mean  CV        mean  CV        mean  CV        mean  CV
3       nodes            349  1.05        89  0.88      6182  1.71       597  1.12
        time (sec)      0.70  1.04     0.075  0.90        19  1.57      0.80  1.05
4       nodes         133572  1.96    156303  1.97   2457689  1.5    3861096  1.95
        time (sec)       470  1.91       191  1.88      9955  1.3       5920  1.83

Encoding. We have encoded this problem using similar SAT and Regular-SAT
encoding schemas. In the SAT encoding, each variable represents a color assigned
to a particular cell, so if $n$ is the order (or size) of the quasigroup, we have $n^3$
variables ($n^2$ cells with $n$ colors each). Then, we generate clauses that encode
the following constraints:

1. Some color must be assigned to each cell. 

2. No color is assigned to two cells in the same row. 

3. No color is assigned to two cells in the same column. 

The first constraint generates clauses of length $n$ with positive literals, and the
second and third ones generate binary clauses with negative literals. The total
number of clauses generated is $O(n^4)$.

In the Regular-SAT encoding, each variable represents a cell of the quasigroup
and the truth value assigned to it represents the color of the cell, so we have
$O(n^2)$ variables and $n$ truth values. Then, we generate clauses that encode the
same constraints as in the SAT encoding, except for the first constraint. This
constraint does not need to be stated explicitly in the Regular-SAT encoding,
because a many-valued interpretation of the variables of the formula ensures that
each cell receives exactly one color. To encode the constraint that a particular
color $i$ cannot be assigned to two different cells $c_1$ and $c_2$ of the same row (or
column), we generate a regular clause of the form

$(\downarrow (i-1) : c_1 \;\vee\; \uparrow (i+1) : c_1 \;\vee\; \downarrow (i-1) : c_2 \;\vee\; \uparrow (i+1) : c_2)$.

By repeating this clause for all the possible colors, we ensure that $c_1$ and $c_2$ do
not receive the same color. The total number of clauses generated is also $O(n^4)$.
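
To make the clause counts concrete, the following Python sketch generates the three families of Boolean clauses listed above for an empty quasigroup of order n and counts the regular clauses of the second encoding. It is a minimal illustration under stated assumptions: pre-assigned cells (the non-hole part of a QWH instance) are ignored, and the function names, the variable numbering, and the DIMACS-style signed integers are choices made here, not taken from the encoder used in the experiments.

```python
from itertools import combinations


def qcp_sat_clauses(n):
    """Boolean SAT encoding of an empty order-n Latin square.

    Variable var(r, c, k) is true iff cell (r, c) has color k.
    Clauses are lists of signed integers (negative = negated literal).
    """
    def var(r, c, k):
        return r * n * n + c * n + k + 1

    clauses = []
    # 1. Some color must be assigned to each cell (positive clause of length n).
    for r in range(n):
        for c in range(n):
            clauses.append([var(r, c, k) for k in range(n)])
    # 2. No color is assigned to two cells in the same row (binary, negative).
    for r in range(n):
        for k in range(n):
            for c1, c2 in combinations(range(n), 2):
                clauses.append([-var(r, c1, k), -var(r, c2, k)])
    # 3. No color is assigned to two cells in the same column (binary, negative).
    for c in range(n):
        for k in range(n):
            for r1, r2 in combinations(range(n), 2):
                clauses.append([-var(r1, c, k), -var(r2, c, k)])
    return clauses


def regular_clause_count(n):
    """One regular clause per color and per pair of cells sharing a row or column."""
    pairs_per_line = n * (n - 1) // 2
    return 2 * n * pairs_per_line * n


for order in (3, 4, 5):
    print(order, len(qcp_sat_clauses(order)), regular_clause_count(order))
```

For order n the Boolean encoding yields n^2 + n^3(n-1) clauses and the regular encoding n^3(n-1), so both grow as the fourth power of the order while the regular encoding uses n^2 rather than n^3 variables.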
Local search results. In order to compare the typical performance of the
Boolean SAT approach with the Regular-SAT approach, we solved hard QWH instances
(i.e., at the phase transition boundary in the backbone) of different orders. For
each order, we considered 100 instances and solved the SAT and Regular-SAT
encodings using WalkSAT m and Regular-WalkSAT |3], respectively. Every 
instance was solved 100 times with both algorithms. The implementation of
WalkSAT used is the one available in SATLIB, and the implementation of
Regular-WalkSAT is the one used in [1] (implemented in C++).



Table 2. Median cost for SAT, Regular-SAT and CSP when solving hard QWH instances
of different order (at the phase transition)

        flips                     time (seconds)
order            SAT   Regular-SAT       SAT   Regular-SAT       CSP
27           964,849       168,455       2.1           1.5       1.7
30         2,985,105       525,884       7.2           4.8       6.9
33        11,123,065     1,520,667      27.1          16.2      57.1
36        30,972,407     5,099,701      70.8          53.9    1422.3

Table 2 shows the median cost, in time and flips, of all the test-sets used.
The cost for a particular instance is defined as the mean time and mean number 
of flips needed to find a solution. We have also included results for the median 
time when using a CSP-based systematic search algorithm implemented with the 
constraint programming library ILOG and that uses the all-different constraint 
and the R-brelaz-R randomized branching strategy [I12I21|2,*T| . The results show 
that the median cost is smaller for the Regular-SAT approach, although between
SAT and Regular-SAT the difference is more significant in terms of the number
of flips. The greater difference in the number of flips can be in part attributed
to the fact that the Regular-SAT encoding is more compact in terms of the
number of variables. However, this difference does not directly translate into an
equivalent difference in overall run time because the flip rate (flips per second)
in Regular-WalkSAT is lower than in WalkSAT. At least some of this difference
can be attributed to a higher level of optimization of the WalkSAT code. Although
our implementation of Regular-WalkSAT is not as optimized, Table 2 still
shows that Regular-WalkSAT also outperforms the other approaches in overall
run time.

Figure 5 shows graphically the scaling behavior in time and flips when we
increase the order of the QWH instances. We see that the relatively good performance
of Regular-SAT scales up nicely with the order of the quasigroup. These
results are consistent with the experimental results obtained with the other problem
domains tested in [3] and summarized in Section 4.1.
Fig. 5. Scaling behaviour of the median hardness for Regular-SAT and SAT on QWH
instances

Fig. 6. Correlation between mean cost with Regular-SAT and SAT for order 27 (a =
0.99, b = 0.81 and $R^2$ = 0.97) and order 30 (a = 1.03, b = 0.57 and $R^2$ = 0.96).

Fig. 7. Correlation between mean cost with Regular-SAT and SAT for order 33 (a =
1.00, b = 0.86 and $R^2$ = 0.91) and order 36 (a = 0.90, b = 1.44 and $R^2$ = 0.86).
Horizontal axis: log10(mean # flips Regular-WalkSAT).
We have also performed a regression analysis to study the relation between
the computational cost of the two different approaches for all the instances of
a given test-set. This kind of analysis allows us to investigate to what extent
the superior performance observed for the median instance is also observed for
any randomly obtained instance within the test-set. Figure 6 (left) shows the
results of the regression analysis performed with the instances of order 27. A
least-mean-squares (lms) regression analysis of the logarithm of the cost was
performed. The figure shows the scatter plot, where each data point $(x, y)$ represents
the logarithm, in base 10, of the mean number of flips performed by
Regular-WalkSAT ($x$ value) and the same quantity for WalkSAT ($y$ value) when
solving a particular instance. The figure also shows the linear equation obtained
by the regression analysis ($\log_{10}(y) = a \log_{10}(x) + b$) and the adjusted coefficient
of determination ($R^2$) that quantifies to what extent the model obtained fits the
experimental data. Observe that by working with the logarithm of the data the
actual functional relation we are fitting is $y = 10^b \cdot x^a$. We see that, for order 27,
$a$ is close to 1 while $10^b$ is about 6.45. So, the relative performance increase for
Regular-SAT holds uniformly for easy, medium, and hard instances within
the test-set.
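
The regression described above can be sketched in a few lines of Python. This is a minimal illustration using ordinary least squares on the base-10 logarithms; the function name `loglog_fit` and the synthetic data are assumptions made here, and the adjusted coefficient of determination is computed with the standard one-predictor correction, which may differ from the statistical package actually used.

```python
import math


def loglog_fit(xs, ys):
    """Least-squares fit of log10(y) = a*log10(x) + b.

    Returns (a, b, adjusted R^2). Requires at least three data points.
    """
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    a = sxy / sxx
    b = my - a * mx
    ss_res = sum((v - (a * u + b)) ** 2 for u, v in zip(lx, ly))
    ss_tot = sum((v - my) ** 2 for v in ly)
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return a, b, r2_adj


# Synthetic data: the second cost is always 6.45 times the first.
regular_flips = [1e4, 1e5, 1e6, 1e7]
boolean_flips = [6.45 * x for x in regular_flips]
a, b, r2_adj = loglog_fit(regular_flips, boolean_flips)
print(a, 10 ** b, r2_adj)

# With the reported b = 0.81 for order 27, the multiplicative factor is 10^b.
print(10 ** 0.81)
```

A slope a close to 1 means the ratio between the two costs is roughly the same constant 10^b across the whole test-set, which is the sense in which the improvement holds uniformly.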

Figures 6 (right) and 7 give the correlation analysis results for orders 30, 33,
and 36. Observe that the fit of the experimental data is better for the smaller 
orders. A possible explanation is that as the order increases, the variability 
in the hardness of QWH instances increases. To properly model the correlation 
between instance hardness and relative performance may require a more complex 
regression model. Nevertheless, our analysis still suggests that the increase in 
performance of Regular-SAT holds fairly uniformly across each test-set. 

Although the average complexity of solving instances from a problem domain 
distribution gives us valuable information about the difficulty of the problem,
the complexity of solving individual instances obtained with the same parame- 
ters can vary drastically from instance to instance. So, a more detailed analysis 
requires a study of the complexity of solving individual instances. To do so, 
we have constructed empirical Run-time distributions (RTDs) and Run-length 
(number of flips needed) distributions (RLDs) for both local search algorithms 
when solving the same instance. The methodology followed has been the one 
used in m- We have focused our attention on the median instance and the 
hardest instance of a given test-set. Here we present results for the test-set of 
quasigroups of order 33. Figure 8 shows the RLDs and RTDs for Regular-SAT
and SAT on the median instance and also the RTD for CSP on the same instance.
These empirical RLDs, in the cumulative form shown, give the probability that
the algorithm finds a solution for the instance in fewer flips than the number
given on the x-axis (similarly in the RTDs). We observe that Regular-SAT strictly
dominates SAT; i.e., the probability of finding a solution with Regular-SAT in
less than $x$ flips is always greater than the probability of finding a solution with
SAT. Regular-SAT dominates the CSP approach even more significantly than
SAT in the run time. Figure 9 shows the same results but for the hardest instance
of the same test-set. We observe a similar relative difference between the
run time performance of the three approaches.
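
The notion of one run-length distribution dominating another can be checked mechanically, as in the following Python sketch. It is an illustration only: the function names and the run lengths are hypothetical, and the empirical distribution counts runs that finish within x flips (less than or equal), a convention chosen here for simplicity.

```python
def rld(run_lengths):
    """Return a function x -> empirical probability of solving within x flips."""
    data = sorted(run_lengths)
    n = len(data)

    def cdf(x):
        return sum(1 for r in data if r <= x) / n

    return cdf


def dominates(runs_a, runs_b):
    """True if A's RLD is never below B's and is strictly above it somewhere.

    Both distributions are step functions that change only at observed
    run lengths, so it suffices to compare them at those points.
    """
    cdf_a, cdf_b = rld(runs_a), rld(runs_b)
    points = sorted(set(runs_a) | set(runs_b))
    never_below = all(cdf_a(x) >= cdf_b(x) for x in points)
    somewhere_above = any(cdf_a(x) > cdf_b(x) for x in points)
    return never_below and somewhere_above


# Hypothetical run lengths (flips) of two algorithms on the same instance.
fast_runs = [100, 200, 300]
slow_runs = [500, 600, 700]
print(dominates(fast_runs, slow_runs), dominates(slow_runs, fast_runs))
```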
Fig. 8. RLDs (left) and RTDs (right) on the median instance for order 33

Fig. 9. RLDs (left) and RTDs (right) on the hardest instance for order 33.
Horizontal axes: flips (left), time (right).

Fig. 10. The average forward-checking backbone for Regular-SAT (left) and SAT (right)
on QWH instances



5 The Backbone Structure

We now consider the structure of the backbone in the QWH problem. Informally 
speaking, the backbone measures the amount of shared structure among the set 
of all solutions to a given problem instance [20]. The size of the backbone is 
measured in terms of the percentage of variables that have the same value in all 
solutions. Achlioptas et al. [1] observed a transition from a phase where the size
of the backbone is almost 100% to a phase with a backbone size close to 0%. 
The transition is sudden and coincides with the hardest problem instances both 
for incomplete and complete search methods. 
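
For small instances the backbone size, in the sense of the fraction of variables that take the same value in all solutions, can be computed by exhaustive enumeration. The Python sketch below does this for a Boolean formula; it is only an illustration of the definition (not of the forward-checking variant discussed next), and the function name, the signed-integer clause format, and the example formula are assumptions made here.

```python
from itertools import product


def backbone_fraction(num_vars, clauses):
    """Fraction of variables with the same value in every satisfying assignment.

    Clauses are lists of nonzero integers: k means variable k is true,
    -k means it is false. Returns None if the formula is unsatisfiable.
    """
    solutions = [
        bits for bits in product([False, True], repeat=num_vars)
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses)
    ]
    if not solutions:
        return None
    fixed = sum(
        1 for v in range(num_vars)
        if all(s[v] == solutions[0][v] for s in solutions)
    )
    return fixed / num_vars


# x1 is forced true, which forces x2 true; x3 remains free.
print(backbone_fraction(3, [[1], [-1, 2], [2, 3]]))
```

Enumeration is exponential in the number of variables, which is one reason a cheaper approximation of the backbone is attractive for larger instances.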

For efficiency purposes, Achlioptas et al. also propose a slightly weaker ver- 
sion of the backbone, which is computed by only using forward-checking (FC) to 
find shared variable settings in the solution set. They show that this backbone is 
qualitatively similar to the original notion of backbone. We adapted the notion 
of SAT FC backbone for Regular-SAT, which is obtained by applying the one- 
literal rule to every regular literal of the formula and computing the fraction of 
the total number of variables that becomes constrained to a single truth value. 

The left panel of Figure 10 shows the FC backbone for QWH instances of
different orders and with a different number of holes for the Regular-SAT encoding.
We observe a phase transition in the fraction of backbone variables for the
Regular-SAT encoding. In contrast, the right panel of Figure 10 displays the FC
backbone structure for the Boolean SAT encoding. As we see from the figure,
the SAT encoding does not properly preserve the phase transition properties of
the backbone structure. The Regular-SAT encoding can capture a structural
property such as the backbone more faithfully than the Boolean SAT encoding.



6 Conclusions 

We have shown that Regular-SAT provides an attractive approach for encoding 
and solving combinatorial problems. The formulation provides an intermediate 
alternative to the SAT and CSP approaches, and combines many of the good 
properties of each paradigm. Its similarity to SAT allows us to extend existing 
SAT algorithms to Regular-SAT without incurring excessive overhead in terms of 
computational cost. We have shown, using a range of benchmark problems, that 
Regular-SAT offers practical computational advantages for solving combinato- 
rial problems. In addition, Regular-SAT maintains more of the original prob- 
lem structure compared to Boolean SAT encodings. By providing more powerful 
search heuristics and optimizing the data structures, we expect to further extend 
the reach of the Regular-SAT approach. 
Abstract. Many real-world problems involve constraints that cannot
all be satisfied. Solving an overconstrained problem then means finding
solutions that minimize the number of constraints violated, which is an
optimization problem. In this research, we study the behavior of the phase
transitions and backbones of constraint optimization problems. We first
investigate the relationship between the phase transitions of Boolean
satisfiability, or precisely 3-SAT (a well-studied NP-complete decision
problem), and the phase transitions of MAX 3-SAT (an NP-hard optimization
problem). To bridge the gap between the easy-hard-easy phase transitions
of 3-SAT and the easy-hard transitions of MAX 3-SAT, we analyze
bounded 3-SAT, in which solutions of bounded quality, e.g., solutions
with at most a constant number of constraints violated, are sufficient.
We show that phase transitions are persistent in bounded 3-SAT and
are similar to those of 3-SAT. We then study backbones of MAX 3-SAT,
which are critically constrained variables that have fixed values in all
optimal solutions. Our experimental results show that backbones of MAX
3-SAT emerge abruptly and experience sharp transitions from nonexistence
when underconstrained to almost complete when overconstrained.
More interestingly, the phase transitions of MAX 3-SAT backbones seem
to concur with the phase transitions of satisfiability of 3-SAT. The
backbone of MAX 3-SAT with size 0.5 approximately collocates with the
0.5 satisfiability of 3-SAT, and the backbone and satisfiability seem to
follow a linear correlation near this 0.5-0.5 collocation.



1 Introduction and Overview 



Understanding phase transition phenomena in complex systems and combinatorial
problems [3,11,12,13,15,16,23,24] has been an active research focus for
more than a decade. It is now well known that Boolean satisfaction problems 
typically exhibit easy-hard-easy phase transitions mm- Specifically, the com- 
putational complexity of 3-SAT, a Boolean satisfaction problem in a conjunctive 
normal form with three literals (variable or its negation) per clause, experiences 
dramatic transitions from easy to difficult and then from difficult back to easy
when the ratio of the number of clauses to the number of variables increases.
Note that 3-SAT is a decision problem, which gives a solution when the problem 
is satisfiable or an answer NO when it is unsatisfiable. 

On the other hand, it has also been shown that the expected complexity 
of finding optimal solutions of tree search problems, which include many of 
those combinatorial optimization problems that are solved by branch-and-bound 
methods, goes through easy to difficult transitions when the underlying heuristic 
functions degenerate I18I15I28I24I . 

In short, the phase transitions of some NP-complete decision problems have 
easy-hard-easy patterns and the phase transitions of some NP-hard optimization 
problems follow easy-hard patterns. These phase transition results exhibit a 
discrepancy between the phase transitions of decision and optimization problems. 

An example of such a discrepancy is explicitly shown in two independent
experimental studies of phase transitions of the Traveling Salesman Problem
(TSP) [10,25]. It was shown that there exists a rapid transition between soluble
and insoluble instances of the decision problem of two-dimensional Euclidean
TSP, and hard instances are associated with this transition, showing an easy-hard-easy
pattern [10]. On the other hand, it was shown that the complexity of
finding optimal solutions to the TSP displays an easy-hard pattern [25].

Phase transitions of different problems have different control or order param- 
eters that may be adjusted to alter the phases of the problems. For instance, an 
order parameter for 3-SAT is the ratio of the number of clauses to the number
of variables PUS! and the number of distinct values of intercity distances is an 
order parameter for the TSP m- 

A more profound concept related to phase transitions is that of the backbone, 
which has been suggested as a more pertinent order parameter to characterize 
a complex problem. For example, a backbone of a Boolean formula is the set of
literals that are true in every model El; and a backbone of a k-coloring problem
is defined to be the set of pairs of nodes each of which has the same color in
every possible k-coloring [S|. In other words, backbone variables are extremely
constrained. Violating a backbone variable rules out all optimal solutions.

This research was first motivated by the fact that there are numerous real- 
world constraint problems for which not all constraints can be satisfied. Such 
problems can be found in application areas such as scheduling, multi-agent cooperation
and coordination, and pattern recognition mn\ . Given such an overconstrained
problem, the task of finding a solution that minimizes the total number
of violated constraints is an optimization problem, or constraint optimization
problem.

We are also motivated to understand the relationship between the phase
transitions of decision problems and those of their optimization counterparts. In
this study, we will focus our investigation on 3-SAT, which is a decision problem,
and MAX 3-SAT, which is an optimization problem that requires the optimal
solutions minimizing the total number of unsatisfied constraints.

Furthermore, we are motivated to investigate the backbones of optimization
problems, particularly the backbone of MAX 3-SAT. Our goal is to understand
the characteristics of all optimal solutions and the behavior of algorithms for
finding them.

The paper is organized as follows. After a brief review of 3-SAT and MAX 
3-SAT, we examine the phase transitions of 3-SAT and MAX 3-SAT by showing
their different phase transition patterns (Section 2.2). We then generalize
the notion of satisfiability to different decision problems with various bounds
on decision quality (Section 2.3). We then study the backbone of MAX 3-SAT
(Section 3). We discuss related work in Section 4 and conclude in Section 5.



2 Decision vs. Optimization Phase Transitions 

In this section, we experimentally analyze the relationship between the phase 
transitions of decision problems and that of optimization problems using 3-SAT 
and MAX 3-SAT. 

In our experiments on 3-SAT and MAX 3-SAT, we used 25 variables and var- 
ied the number of clauses to generate random problem instances. We restricted 
ourselves to this set of relatively small problems for the following reasons.
First, as we will see in the rest of the paper, MAX 3-SAT is much more difficult 
to solve than 3-SAT. Second, to find the backbones of these problems, we need 
to find all optimal solutions, which is substantially more difficult than finding 
just one solution. Third, this study is a statistical, experimental investigation, so
we need to collect data from a relatively large pool of problem instances.
Nevertheless, we have done experiments on 50-variable problems using a few
dozen instances, and observed phenomena similar to those reported here.

In generating a clause, a randomly chosen variable has a 50 percent chance
of being negated. No duplicate clause is allowed in a problem instance. We varied
the clause/variable ratio from 1 to 20, with an increment of 0.2. For each
clause/variable ratio, we generated 1,000 problem instances. We collected the
median value or computed an averaged value of the results on these instances as
needed.
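
The generation procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the three variables of a clause are taken to be distinct (the text does not say so explicitly), the function names and the fixed random seed are choices made here, and the number of clauses for a given ratio is obtained by rounding.

```python
import random


def random_3sat(num_vars, num_clauses, rng):
    """Random 3-SAT instance without duplicate clauses.

    Each clause picks three distinct variables (an assumption); each
    variable is negated with probability 0.5.
    """
    seen = set()
    clauses = []
    while len(clauses) < num_clauses:
        variables = rng.sample(range(1, num_vars + 1), 3)
        clause = tuple(sorted(v if rng.random() < 0.5 else -v for v in variables))
        if clause not in seen:  # no duplicate clause is allowed
            seen.add(clause)
            clauses.append(clause)
    return clauses


def ratios(start=1.0, stop=20.0, step=0.2):
    """Clause/variable ratios from start to stop inclusive."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 1) for i in range(count)]


rng = random.Random(0)
num_vars = 25
for ratio in ratios()[:3]:
    instance = random_3sat(num_vars, round(ratio * num_vars), rng)
    print(ratio, len(instance), instance[0])
```

With 25 variables there are 18,400 distinct clauses of this kind, so even the largest ratio (500 clauses) leaves ample room for rejecting duplicates.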

In this study, we used the well-known Davis-Putnam-Loveland (DPL)
method, a backtracking method with unit resolution 0. This algorithm is a 
special case of depth-first branch-and-bound where one variable is instantiated 
at each step. We extended the method to handle both 3-SAT and MAX 3-SAT. 
Owing to space limits, we leave the details of our extension to another report of the
research.
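
For reference, the basic decision procedure (backtracking with unit resolution) can be sketched in Python as below. This covers only the plain satisfiability test; the extension to MAX 3-SAT is not described in the text and is not attempted here. The function names and the branching rule (first literal of the first remaining clause) are assumptions made for illustration.

```python
def simplify(clauses, literal):
    """Assign `literal` true: drop satisfied clauses, remove the opposite literal.

    Returns None if an empty clause (a conflict) is produced.
    """
    result = []
    for clause in clauses:
        if literal in clause:
            continue
        reduced = [l for l in clause if l != -literal]
        if not reduced:
            return None
        result.append(reduced)
    return result


def dpl(clauses):
    """Return True if the clause list is satisfiable."""
    # Unit resolution: repeatedly assign literals forced by unit clauses.
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        clauses = simplify(clauses, unit)
        if clauses is None:
            return False
    if not clauses:
        return True
    # Branch: one variable is instantiated at each step.
    literal = clauses[0][0]
    for choice in (literal, -literal):
        reduced = simplify(clauses, choice)
        if reduced is not None and dpl(reduced):
            return True
    return False


print(dpl([[1, 2], [-1, 2], [1, -2]]))
print(dpl([[1, 2], [-1, 2], [1, -2], [-1, -2]]))
```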



2.1 3-SAT and MAX 3-SAT Problems 

Boolean satisfiability, or SAT for short, is a constraint satisfaction problem
(CSP) that involves a Boolean formula consisting of a set of Boolean variables
and a conjunction of a set of disjunctive clauses of literals, which are variables
and their negations. A clause is satisfied if a literal within it takes a true value,
and a Boolean formula is satisfied if all the clauses are satisfied. The conjunction
defines constraints on the possible combinations of variable assignments. 3-SAT
is a special case of Boolean satisfiability in which each clause has three literals.
3-SAT is NP-complete, and it is unlikely to have a polynomial algorithm
[8]. Many practical problems can be cast as SAT 12122 ].

Furthermore, there are also practical SAT problems in which no variable assignment
can be found that satisfies every constraint [7]. In this case, it
is required to find an assignment such that the total number of satisfied clauses
is maximized. This is called maximum 3-SAT, or MAX 3-SAT.

2.2 Discrepancy of Phase Transitions 

As discussed in Section 1, there is a discrepancy between the phase transitions of
decision problems and the phase transitions of their corresponding optimization 
versions. We investigate this discrepancy in detail. 

We first consider 3-SAT. Figure 1 shows two types of phase transitions, a transition
between satisfiability and unsatisfiability and easy-hard-easy transitions of
computation cost. The order parameter that determines the phase transitions is
the ratio of the number of clauses to the number of variables. The critical value
of this order parameter for 3-SAT is around 4.13 P]. A 3-SAT instance is almost always
satisfiable when the clause/variable ratio is below this critical value and is almost
always unsatisfiable when the ratio is beyond the critical value, making a sharp
transition from satisfiability to unsatisfiability. Furthermore, the computational
complexity required to decide the satisfiability is low when the probability of
satisfiability is close to one or zero, while the complexity is the highest when
this probability is 0.5, a value taken when the clause/variable ratio is around
4.13.

We now consider MAX 3-SAT. The only property we need to consider is
its computational complexity, since an optimal solution is required throughout
the whole spectrum of consideration so that there is no notion of satisfiability
for the problem. Figure 2 shows the complexity of solving random MAX 3-SAT
with 25 variables and various numbers of clauses. The problem instances used
in Figure 2 are the same as those in Figure 1. To contrast the result with 3-SAT,
we also include the complexity curve for 3-SAT. Figure 2 shows that starting
at point A in the figure, MAX 3-SAT follows 3-SAT to enter a computationally
difficult region. However, MAX 3-SAT becomes more and more difficult as the
clause/variable ratio increases, even when 3-SAT enters its second easy region.
In other words, the complexity of MAX 3-SAT follows an easy-hard pattern as
the clause/variable ratio increases.

The discrepancy between the different patterns of the complexity phase transitions
of 3-SAT and MAX 3-SAT indicates that optimizing is more difficult than
making a decision. The optimal solution to a MAX 3-SAT instance can obviously be
used to answer the question of whether the corresponding 3-SAT instance is
satisfiable. Thus MAX 3-SAT, an optimization problem, is at least as hard as its
corresponding 3-SAT, a decision problem. This discrepancy also indicates that
constraints play different roles in an optimization problem and in its decision
counterpart. When a problem instance is satisfiable, deciding if it is satisfiable
is to find a variable assignment satisfying all the constraints, which is also an
optimal solution to the optimization version of the problem. When a constraint
problem is overconstrained, a small subset of the problem is very likely to be
overconstrained as well, so that the problem can be declared unsatisfiable when
such an overconstrained subproblem is detected unsatisfiable. The more constrained
the problem is, the more quickly the decision process can conclude that no solution
exists. However, in an overconstrained case, finding an optimal solution to minimize
the total number of violated constraints is typically hard since every possible
variable assignment can be a candidate for an optimal solution.

Fig. 1. Phase transitions of 3-SAT.

Fig. 2. Phase transitions of MAX 3-SAT.

2.3 Quality-Bounded Decision Problems 

The discrepancy between the two different phase transition patterns of 3-SAT 
and MAX 3-SAT has motivated us to investigate the relationship of the phase 
transitions of these two closely related problems. 

In between a decision problem and its optimization counterpart there are
many middle grounds that consist of decision problems with different decision
objectives and quality. Such a decision problem may ask if there exists a variable
assignment that violates no more than B constraints for an integer bound B.
We call such a general decision problem a quality-bounded decision problem, or
bounded decision problem for short, and denote it as 3-SAT(B). A 3-SAT(B) instance
is satisfied if an assignment that violates no more than B constraints exists. It
takes 3-SAT and MAX 3-SAT as special cases. When B = 0 it is 3-SAT; when
B is the optimal solution cost, it is equivalent to MAX 3-SAT.
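
The relation between the bounded decision problem and its two special cases can be illustrated with a brute-force Python sketch. It enumerates all assignments and is therefore only usable on tiny instances; it is not the extended DPL procedure used in the experiments, and the function names and the example formula are assumptions made here.

```python
from itertools import product


def violated(clauses, bits):
    """Number of clauses not satisfied by the assignment `bits`."""
    return sum(
        1 for clause in clauses
        if not any(bits[abs(l) - 1] == (l > 0) for l in clause)
    )


def min_violations(num_vars, clauses):
    """Optimal MAX-SAT cost by exhaustive enumeration."""
    return min(violated(clauses, bits)
               for bits in product([False, True], repeat=num_vars))


def bounded_sat(num_vars, clauses, bound):
    """Bounded decision: does some assignment violate at most `bound` clauses?"""
    return any(violated(clauses, bits) <= bound
               for bits in product([False, True], repeat=num_vars))


# All eight sign patterns over three variables: every assignment
# falsifies exactly one clause, so the formula is unsatisfiable.
clauses = [[s1 * 1, s2 * 2, s3 * 3]
           for s1, s2, s3 in product([1, -1], repeat=3)]
print(min_violations(3, clauses))
print(bounded_sat(3, clauses, 0), bounded_sat(3, clauses, 1))
```

With bound 0 the test coincides with ordinary satisfiability, and the smallest bound for which it succeeds equals the optimal MAX-SAT cost.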

Are the phase transition properties of 3-SAT preserved under the general notion
of satisfiability? Specifically, does 3-SAT(B) still exhibit a sharp transition
from satisfiability to unsatisfiability and easy-hard-easy complexity transitions
when the clause/variable ratio increases?

Figures 3 and 4 show our experimental results that answer these questions.
Figure 3 shows the probability of satisfiability of 3-SAT(0), 3-SAT(5), 3-SAT(10),
3-SAT(15), and 3-SAT(20). The figure shows that 3-SAT(B) still has a sharp
transition from satisfiable to unsatisfiable as the clause/variable ratio increases.
The location of the transition depends on B, however: for a given clause/variable
ratio, the larger B is, the more problem instances are satisfiable. Similar to
the unsettled issue of the exact location of the satisfiable to unsatisfiable transition
of 3-SAT, it remains an interesting open problem to analytically determine
the transition location of 3-SAT(B) with a non-zero integer bound B.

Figure 4 shows the computational complexity of an extended Davis-Putnam-Loveland
(DPL) algorithm on 3-SAT(0), 3-SAT(5), 3-SAT(10), 3-SAT(15),
3-SAT(20), and MAX 3-SAT. Note that the vertical axis is in a logarithmic scale.
As the curves in the figure show, although the complexity of 3-SAT(B) still
follows an easy-hard-easy transition pattern, the second easy region, where problem
instances are overconstrained, is no longer very easy compared to the
first easy region, where the problems are underconstrained. The larger B is, the
computationally more difficult the second easy region becomes.

In short, a profound feature of phase transitions on computational complexity
of 3-SAT(B) is that the transition from the first easy region to the difficult region
is very sharp. In the first easy region, the computational complexity of 3-SAT(B)
is relatively constant regardless of the actual value of B. Once the complexity
enters the difficult region, it increases exponentially.

Fig. 3. Satisfiability phase transitions of 3-SAT(B).

Fig. 4. Complexity phase transitions of 3-SAT(B).



More importantly, Figure 4 shows that the complexity curve of MAX 3-SAT
is an upper envelope of the complexity curves of 3-SAT(B). Furthermore, the peak
of the complexity of 3-SAT(B) is at the location where the quality bound B is
near the optimal solution cost of MAX 3-SAT, and this peak is close to the
complexity of the corresponding MAX 3-SAT.



3 Phase Transitions and Backbones 

We now study the phase transitions and backbones of decision and optimization 
problems. We investigate in particular various phase transition behavior and 
backbones of optimal solutions of 3-SAT and MAX 3-SAT. 
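
To fix ideas, the backbone of an optimization instance (the variables that have the same value in all optimal solutions) can be computed by enumeration on very small formulas, as in the Python sketch below. This is an illustration of the definition only, not the procedure used in the experiments; the function name, the return format, and the example formula (whose clauses are shorter than three literals for brevity) are assumptions made here.

```python
from itertools import product


def maxsat_backbone(num_vars, clauses):
    """Return (optimal cost, number of optimal solutions, backbone).

    The backbone maps each 1-based variable that has the same value in
    every optimal assignment to that value.
    """
    def cost(bits):
        return sum(
            1 for clause in clauses
            if not any(bits[abs(l) - 1] == (l > 0) for l in clause)
        )

    assignments = list(product([False, True], repeat=num_vars))
    costs = [cost(bits) for bits in assignments]
    best = min(costs)
    optimal = [bits for bits, c in zip(assignments, costs) if c == best]
    backbone = {
        v + 1: optimal[0][v]
        for v in range(num_vars)
        if all(bits[v] == optimal[0][v] for bits in optimal)
    }
    return best, len(optimal), backbone


# Overconstrained on x1: two clauses want it true, one wants it false.
best, count, backbone = maxsat_backbone(3, [[1], [1], [-1], [2, 3]])
print(best, count, backbone)
```

The same enumeration yields the three quantities studied in this section: the cost of optimal solutions, their number, and the backbone.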

In our experiments, we used the same set of randomly generated problem 
instances as for the experiments in the previous section. Specifically, we used 
25 variables and varied the number of clauses by changing the clause/variable
ratio from 1 to 20, with an increment of 0.2. For each clause/variable ratio, we
generated 1,000 problem instances. We collected the median value or computed 
an averaged value of the results on these instances as needed. We also used the 
same extended DPL algorithm in these experiments. 

3.1 Phase Transitions Related to Optimal Solutions 

It is known that 3-SAT has a large number of satisfying solutions whenever it 
is underconstrained, which accounts for the low computational cost in the 
underconstrained region. These satisfying solutions are also optimal solutions 
to MAX 3-SAT. When 3-SAT is overconstrained, however, a satisfying solution 
is unlikely to exist. What is the total number of optimal solutions when MAX 
3-SAT is overconstrained? How do the other two important characteristics, 
the cost of optimal solutions and the computational cost of finding all optimal 
solutions, behave? 

Figure 5 shows the median number of optimal solutions of 3-SAT and MAX 
3-SAT with 25 variables as a function of the clause/variable ratio. The dotted 
line in the figure marks where the 50 percent satisfiability of 3-SAT occurs. 
The vertical axis is in logarithmic scale. As Figure 5 shows, the curve of the 
number of optimal solutions can be divided into two segments. In the 
underconstrained region, where the satisfiable instances dominate, the number 
of solutions decreases exponentially as the clause/variable ratio increases. In 
the overconstrained region, the number of solutions is less than a dozen, and 
decreases approximately linearly with the clause/variable ratio. 

Another characteristic associated with finding all solutions is the cost 
of optimal solutions, which is shown in Figure 6. The cost curve also has two 
segments, separated again by the 50 percent satisfiability line (the dotted line 
in the figure). In the underconstrained region, the median number of violated 
clauses remains zero, while in the overconstrained region, the cost increases 
linearly with the clause/variable ratio. 

We now examine the computational cost of finding all optimal solutions. 
Figure 7 shows the experimental results. The median computational costs are 
shown in logarithmic scale along the vertical axis. The overall computational 
cost curve is again separated by the 50 percent satisfiability point of 3-SAT, which 



Phase Transitions and Backbones of 3-SAT and Maximum 3-SAT 







Fig. 5. Number of optimal solutions of 3-SAT and MAX 3-SAT. 




Fig. 6. Cost of optimal solutions of MAX 3-SAT. 



is shown by the vertical dotted line in Figure 7. The major trend of the curve 
in the underconstrained region is an exponential drop. This differs significantly 
from the low, increasing computational cost of finding one satisfying solution of 

Fig. 7. Computational cost of 3-SAT and MAX 3-SAT. 



3-SAT in this region, as shown in Figure 1. The higher computational cost when 
the ratio is smaller is mostly due to enumerating the large number of optimal 
solutions (cf. Figure 5). When the clause/variable ratio passes the 50 
percent satisfiability point, the computational cost increases steadily and 
exponentially. 

If finding a single satisfying solution to 3-SAT at the 50 percent satisfiability 
point is considered difficult (cf. Figure 1), then finding all solutions of 3-SAT 
and MAX 3-SAT is a much harder problem. Based on Figure 7, the cost of finding 
all solutions around the 50 percent satisfiability point is near its lowest. 

In summary, the three main features associated with finding all optimal solu- 
tions of 3-SAT and MAX 3-SAT, the number of optimal solutions, the computa- 
tional cost and the cost of optimal solutions, are segmented by the 50 percent sat- 
isfiability point of 3-SAT, and follow different patterns in the underconstrained 
and overconstrained regions. 

3.2 Backbone Phase Transitions 

The backbone of a 3-SAT instance is the fraction of literals that have fixed 
values in all satisfying solutions. In parallel, the backbone of a MAX 3-SAT 
instance is the fraction of literals that have fixed values in all optimal 
solutions. In short, backbone variables are critically constrained: setting any 
of them to a different value rules out every optimal solution. 

The size of a backbone can be normalized to a real number ranging from 0 
to 1. A normalized backbone of 0 means that no variable is a backbone variable; 
while a normalized backbone of size 1 means that all variables are backbone 
variables. 
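
To make the definition concrete, the following Python sketch computes the normalized backbone of a small MAX-SAT instance by exhaustive enumeration. This is an illustrative brute-force procedure, not the extended DPL algorithm used in the experiments; the function names and the clause representation (tuples of signed integers) are assumptions.

```python
from itertools import product

def violated(clauses, assignment):
    """Number of clauses with no true literal; assignment[v] is the bool value of v."""
    return sum(
        not any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def optimal_solutions(clauses, num_vars):
    """Return (minimum number of violated clauses, list of all optimal assignments)."""
    best_cost, best = None, []
    for values in product([False, True], repeat=num_vars):
        assignment = dict(enumerate(values, start=1))
        cost = violated(clauses, assignment)
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, [assignment]
        elif cost == best_cost:
            best.append(assignment)
    return best_cost, best

def normalized_backbone(clauses, num_vars):
    """Fraction of variables that take the same value in every optimal solution."""
    _, best = optimal_solutions(clauses, num_vars)
    fixed = sum(len({sol[v] for sol in best}) == 1 for v in range(1, num_vars + 1))
    return fixed / num_vars

if __name__ == "__main__":
    # These four clauses force x1 to be true and leave x2 and x3 free.
    clauses = [(1, 2, 3), (1, 2, -3), (1, -2, 3), (1, -2, -3)]
    print(optimal_solutions(clauses, 3)[0], normalized_backbone(clauses, 3))
```

For a satisfiable instance the optimal cost is zero, so the same routine yields the 3-SAT backbone. The enumeration is exponential in the number of variables and is only meant for very small instances.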

Our study of MAX 3-SAT backbones revealed two interesting results. First, 
the backbones exhibit phase transitions, as shown in Figure 8, which plots the 
normalized median backbone sizes of 1,000 3-SAT and MAX 3-SAT problem 
instances. As the figure shows, backbones emerge abruptly as the 
clause/variable ratio increases. When the ratio is less than 3.6, backbones 
hardly exist. When the ratio exceeds 3.6, backbones emerge quickly. Before the 
ratio reaches 6, the median backbone size grows to more than 0.7, and it 
exceeds 0.9 when the ratio is 11. The figure also shows that the backbone size 
of 3-SAT grows faster than that of MAX 3-SAT as the clause/variable ratio 
increases. 

The second interesting, and somewhat surprising, result is that the backbone 
phase transitions of MAX 3-SAT coincide with the satisfiability phase 
transitions of the corresponding 3-SAT. This is shown in Figure 9. The location 
where the backbone of MAX 3-SAT is 0.5 coincides approximately with the 
location where the corresponding 3-SAT is satisfiable with probability 0.5. 
In the vicinity of this 0.5-0.5 collocation (the dotted square in Figure 9), 
the backbone of MAX 3-SAT and the satisfiability of 3-SAT seem to 
be linearly correlated: as the backbone increases, the probability of 
satisfiability drops proportionally, and vice versa. 

There are only a few optimal solutions when the clause/variable ratio is very 
large, as shown by Figure 5. The backbone size is large when the clause/variable 




Fig. 8. Phase transitions of 3-SAT and MAX 3-SAT backbones. 

Fig. 9. Collocation of 0.5-0.5 transition points and near linear relation. 



ratio is large, as shown in Figure 8. The combination of these two factors 
indicates that a handful of optimal solutions are clustered in a small 
neighborhood. Therefore, searching for any one of the clustered optimal 
solutions is difficult when the backbone is large, since the search is more 
likely to make the mistake of not setting a backbone variable to its correct 
value. On the other hand, when there is no backbone variable, an arbitrary 
variable assignment may be an optimal solution, so finding such an optimal 
solution is easy. 

4 Related Work 

Huberman and Hogg argued that phase transitions are a universal 
feature of complex systems and problems m- Cheeseman et al. [3] first 
experimentally demonstrated the existence of phase transitions in many 
combinatorial decision problems, including Boolean satisfiability, the 
Traveling Salesman Problem and graph coloring. The phase transitions of 3-SAT 
were extensively examined by Mitchell et al. m and many other authors. This 
line of work concentrated mainly on decision problems. One of the main results 
is that the average computational complexity of decision problems follows an 
easy-hard-easy pattern. 

The study of phase transitions of optimization problems probably started 
with Karp and Pearl’s work on best-first search on a special random tree [13]. This 
random tree is an abstract model of many combinatorial search problems and 
state-space search algorithms, including best-first search and depth-first 
branch-and-bound. This work was extended to a more general tree by McDiarmid US- 






Zhang and Korf extended the work to various linear-space search algorithms, 
including depth-first branch-and-bound and iterative deepening [23,24]. A main 
conclusion of this line of research is that the expected computational complexity 
of optimization problems typically exhibits an easy-hard pattern. 

The discrepancy between the easy-hard-easy phase transitions of decision 
problems and the easy-hard transitions of optimization problems has inspired 
us to investigate the relationship of these two types of problems and their phase 
transitions closely in this research. This complements the previous research on 
phase transitions of decision and optimization problems mm- One of the results 
of this research reconciles the relationship between the phase transitions of these 
two types of combinatorial problems, especially 3-SAT and MAX 3-SAT. In 
addition, we also show that the curves of the computational cost of bounded 
3-SATs are upper bounded by the curve of computational cost of MAX 3-SAT. 

The backbone is an old concept, studied by Kirkpatrick and Toulouse 
on the Traveling Salesman Problem [14], that has attracted much attention 
recently. Monasson et al. investigated the backbones of 3-SAT and (2+p)-SAT and 
suggested the backbone as an order parameter for the decision problems m- 
Culberson and Gent extended the concept of backbones to graph coloring |S]. 
Achlioptas also considered the backbones of quasigroup completion problems [I]. 
Slaney and Walsh studied the backbones of many combinatorial optimization and 
approximation problems, such as graph coloring, the Traveling Salesman Problem, 
number partitioning and blocks world planning m- The relationship between 
backbone and local search on 3-SAT was studied by Parkes m and Singer et 
al. [in]. 

Compared to the existing work on backbones, we make two main contributions 
in this research. The first is the collocation of the 0.5 backbone of 
MAX 3-SAT (an optimization problem) and the 0.5 satisfiability of 3-SAT (a 
decision problem). The second is the near linear correlation between 
these two phase transitions of two different but closely related problems. 



5 Conclusions 

We draw two conclusions from this research on constraint satisfaction and 
constraint optimization problems. First, phase transitions persist in bounded 
3-SAT (3-SAT(B)), in which up to B constraints may be violated. We showed that 
deciding whether there exists a variable assignment with no more than B 
constraints unsatisfied exhibits phase transitions similar to those in 3-SAT, 
i.e., sharp satisfiable-to-unsatisfiable transitions and easy-hard-easy 
computational complexity phase transitions. However, the difficulty of the 
second computationally easy phase in 3-SAT(B) increases with the quality bound 
B. Furthermore, the computational cost of MAX 3-SAT envelops the computational 
cost peaks of 3-SAT(B). 

Second, the backbones of 3-SAT and MAX 3-SAT also experience phase 
transitions. A backbone is almost nonexistent in the underconstrained region, 
emerges abruptly when moving toward the critically constrained region, and 
quickly increases to almost full size in the overconstrained region. The 
backbone of MAX 3-SAT with size 0.5 appears approximately at the location where 
3-SAT is satisfiable with probability 0.5. Near this 0.5-0.5 phase transition 
collocation, the backbone of MAX 3-SAT and the satisfiability of 3-SAT seem to 
be linearly correlated. 

This research makes two contributions. First, it reconciles the relationship 
between the phase transitions of decision and optimization problems, which were 
discovered in different problem domains, bridging the gap between the previous 
phase transition results on these two types of problems. Second, it suggests 
that the backbone of the solutions of optimization problems is an order 
parameter for the problems. 

This work also gives rise to many interesting open questions for future re- 
search. For instance, where is the exact phase transition location of bounded 
3-SAT(B)? Why does the backbone of MAX 3-SAT with size 0.5 collocate with 
50 percent satisfiability of 3-SAT? Why does the backbone appear to have a 
linear correlation with the satisfiability? 
Abstract. Non-binary constraint satisfaction problems (CSPs) can be 
solved in two different ways. We can either translate the problem into 
an equivalent binary one and solve it using well-established binary CSP 
techniques or use extended versions of binary techniques directly on the 
non-binary problem. Recently, it has been shown that the hidden variable 
encoding is a promising method of translating non-binary CSPs into 
binary ones. In this paper we make a theoretical and empirical 
investigation of arc consistency and search algorithms for the hidden variable 
encoding. We analyze the potential benefits of applying arc consistency 
on the hidden encoding compared to generalized arc consistency on the 
non-binary representation. We also show that search algorithms for 
non-binary constraints can be emulated by corresponding binary algorithms 
that operate on the hidden variable encoding and only instantiate 
original variables. Empirical results on various implementations of such 
algorithms reveal that the hidden variable encoding is competitive with, and 
in many cases better than, the non-binary representation for certain classes 
of non-binary constraints. 



1 Introduction 

The majority of the research on constraint satisfaction problems (CSPs) has 
focused on algorithms and heuristics that are applied to binary problems. The 
main reason for this is that any problem that contains constraints of 
arbitrary arity can be transformed to an equivalent binary problem m- In the 
past, research on non-binary CSPs has mainly dealt with filtering algorithms. 
Recently, it has been recognized that more research on other non-binary issues 
is also required. As a result, search algorithms for binary CSPs have been 
extended to non-binary ones (0) and the efficiency of binary encodings has been 
investigated ( inw i)- 

The most popular binary translations are the dual graph encoding and the 
hidden variable encoding. It is not clear which of the two is better. However, 
the hidden variable encoding has some nice theoretical properties which make it 
a promising technique in many cases mm- First, arc consistency (AC) on this 
binary representation achieves the same consistency level as generalized arc 
consistency (GAC) on the non-binary problem. This means that MAC (i.e., 
maintaining arc consistency) applied on the hidden variable encoding of a 
non-binary CSP visits the same search tree nodes as MGAC (i.e., maintaining 
generalized arc consistency) on the non-binary representation. Second, 
enforcing AC on an arbitrary encoded non-binary constraint takes the same 
number of consistency checks in the worst case as GAC on its non-binary 
representation. These theoretical results indicate that the hidden variable 
encoding is a promising way of solving non-binary CSPs with MAC. In practice, 
we can only use the hidden variable encoding on CSPs that have tight 
constraints. For CSPs with a large number of loose constraints it is reasonable 
to assume that the hidden variable encoding will be inefficient due to its 
large space requirements. It has also been shown experimentally that solving 
the binary encoding of a non-binary CSP can be less efficient than applying a 
non-binary version of some search algorithm, and vice versa, depending on the 
tightness of the constraints [ma- 

In this paper we take a closer look at arc consistency and search algorithms 
for the hidden variable encoding. The difference between an arc consistency 
algorithm on the encoding and a generalized arc consistency algorithm is that 
the former has to update the domains of the hidden variables as well as 
the original ones. We show that this can lead to an arc consistency algorithm that 
runs on the encoding and, for any arc consistent graph, performs exactly the same 
number of consistency checks as the corresponding generalized arc consistency 
algorithm. For arc inconsistent graphs we show that AC on the encoding can 
detect the inconsistency earlier and thus perform fewer checks than GAC. In a 
special case, the algorithms are equivalent not only in consistency checks but also 
in all the primitive operations they perform (e.g., domain lookups and deletions). 
In general, there is a trade-off between the binary and non-binary algorithms in 
the number of primitive operations they perform. We also show that, like MGAC, 
the generalizations of forward checking to non-binary CSPs can be simulated 
by a corresponding binary forward checking algorithm on the hidden variable 
encoding that only instantiates original variables, resulting in the same node 
visits. We make an empirical comparison of different implementations of binary 
and generalized algorithms, which reveals that the hidden variable encoding can 
be competitive with, and often better than, the non-binary representation in 
certain classes of tight non-binary CSPs. 

2 Background 

A constraint satisfaction problem (CSP) $\mathcal{P}$ is defined by a triple 
$(X, \mathcal{D}, C)$. $X$ is a set of $n$ variables. Each variable $x_i \in X$ 
takes values from a domain $D_i \in \mathcal{D}$. $C$ is a set of $e$ 
constraints. Each $k$-ary constraint is defined over an ordered set of 
variables $(x_1, \ldots, x_k)$ by a subset of the Cartesian product 
$D_1 \times \ldots \times D_k$ that specifies the set of allowed value 
combinations (tuples). A constraint can be defined either extensionally by the 
set of allowed tuples or intensionally by a predicate or arithmetic function. 
In the following we will assume that all non-binary constraints are defined 
extensionally by nature, or can be represented extensionally without excessive 
space requirements. We also assume that there is at most one constraint per 
variable combination.¹ 

A value $a$ in the domain $D$ of variable $x$ is consistent with a constraint $c$ 
if $x$ is not included in the variables of the constraint, or if it is included and 
there exists a valid tuple $\tau$ in $c$ where $x = a$. In the latter case we say 
that $\tau$ is a support for $a$ in $c$. Checking whether a tuple is a support for 
a variable-value pair $(x, a)$ is called a consistency check. A variable $x$ is 
consistent with a constraint $c$ if $D \neq \emptyset$ and all its values are 
consistent with $c$. A constraint $c$ is arc consistent (AC) if 
$\forall x_i \in X$, $x_i$ is consistent with $c$. A binary CSP is arc consistent 
if all its constraints are arc consistent. A CSP is singleton arc consistent 
(SAC) iff it has non-empty domains and for any instantiation of a variable, the 
problem can be made arc consistent. We call the generalizations of AC and SAC to 
non-binary CSPs GAC and SGAC respectively. Finally, a solution to a CSP is an 
assignment of values to variables that is consistent with all constraints. 

Following [H|, we call a local consistency property A stronger than B iff for 
any problem A deletes at least the same values as B, and strictly stronger iff it 
is stronger and for at least one problem A deletes more values than B. We call 
A equivalent to B iff they delete the same values for all problems. Similarly, we 
call a search algorithm A stronger than an algorithm B iff for every problem 
A visits at most the same search tree nodes as B, and strictly stronger iff it is 
stronger and for at least one problem A visits fewer nodes than B. A is equivalent 
to B iff they visit the same nodes for all problems. 



2.1 Hidden Variable Encoding 

The hidden variable encoding E] is a well-known method for transforming a 
non-binary CSP into a binary one. It encodes the non-binary constraints as 
variables (called “hidden” variables) whose domains are the valid tuples of the 
constraint. For each hidden variable $v_c$, the encoding introduces a 
compatibility constraint between $v_c$ and each original variable $x_i$ in 
the constraint $c$. Each such constraint specifies that the tuple assigned to 
$v_c$ is consistent with the value assigned to $x_i$. Consider the following 
example with six variables with 0-1 domains and four constraints: 
$x_1 + x_2 + x_6 = 1$, $x_1 - x_3 + x_4 = 1$, $x_4 + x_5 - x_6 \geq 1$, and 
$x_2 + x_5 - x_6 = 0$. In the hidden variable encoding (Figure 1) there are, in 
addition to the original six variables, four hidden variables. The domains of 
these hidden variables are the tuples that satisfy the respective constraint. 
For example, the hidden variable associated with the third constraint, $v_3$, 
has the domain $\{(0,1,0), (1,0,0), (1,1,0), (1,1,1)\}$, as these are the tuples 
of values for $(x_4, x_5, x_6)$ which satisfy $x_4 + x_5 - x_6 \geq 1$. There are 
now compatibility constraints between $v_3$ and $x_4$, between $v_3$ and $x_5$, 
and between $v_3$ and $x_6$, as these are the variables mentioned in the third 
constraint. 
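
The encoding of this example can be reproduced with a short Python sketch. The representation chosen here (scopes as tuples of variable names, predicates as lambdas, hidden variables named `v1` to `v4`) is an assumption made for illustration, not the authors' implementation.

```python
from itertools import product

# The four constraints of the example: (scope, predicate) over 0/1 variables.
CONSTRAINTS = [
    (("x1", "x2", "x6"), lambda a, b, c: a + b + c == 1),
    (("x1", "x3", "x4"), lambda a, b, c: a - b + c == 1),
    (("x4", "x5", "x6"), lambda a, b, c: a + b - c >= 1),
    (("x2", "x5", "x6"), lambda a, b, c: a + b - c == 0),
]

def hidden_variable_encoding(constraints, domain=(0, 1)):
    """Return (hidden_domains, compatibility).

    hidden_domains maps each hidden variable to its list of allowed tuples.
    compatibility lists (hidden variable, original variable, position) triples;
    a triple (v, x, i) means that the i-th element of the tuple assigned to v
    must equal the value assigned to x.
    """
    hidden_domains = {}
    compatibility = []
    for index, (scope, predicate) in enumerate(constraints, start=1):
        name = f"v{index}"
        hidden_domains[name] = [
            t for t in product(domain, repeat=len(scope)) if predicate(*t)
        ]
        compatibility.extend((name, x, pos) for pos, x in enumerate(scope))
    return hidden_domains, compatibility

if __name__ == "__main__":
    hidden, compat = hidden_variable_encoding(CONSTRAINTS)
    print(hidden["v3"])
    print([x for v, x, _ in compat if v == "v3"])
```

Each ternary constraint contributes three binary compatibility constraints, so the encoded problem has twelve binary constraints in total.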

¹ Multiple constraints on the same set of variables can be reduced to a single 
constraint in the extensional representation. 

Fig. 1. Hidden variable encoding of a non-binary CSP. The binary constraint $r_i$ 
applies to a tuple and a value and is true iff the $i$th element of the tuple 
equals the value. 



3 Arc Consistency 

In this section we study the relationship between AC on the hidden variable en- 
coding and GAC in more detail by examining the benefits of revising the domains 
of hidden variables. We will show that these revisions can help an AC algorithm 
on the encoding to identify inconsistencies earlier than the corresponding GAC 
algorithm. 



3.1 GAC Algorithms 

GAC-4 [To] is designed for constraints represented in extension by their allowed 
tuples. Each time a value $a$ is deleted from a variable $x$, the tuples that 
include this variable-value pair are also deleted from the lists of allowed 
tuples. The deletion of these tuples may trigger the deletion of further values 
that lose their support, and so on. We can view this algorithm as a binary 
algorithm that runs on the hidden variable encoding. The only modification we 
need to make is to consider a constraint $c$ as a hidden variable $h_c$ and the 
set of allowed tuples of $c$ as the domain of $h_c$. The propagation of deletions 
can then be done in exactly the same way, resulting in the same primitive 
operations as in the non-binary case.² By primitive operation we mean a domain 
lookup (i.e., checking whether a value is in the domain of a variable), a 
deletion of a value (or a tuple), a consistency check, or any other check in a 
list or other data structure. 

² This equivalence has been pointed out by Christian Bessiere at CP’99. 

GAC-3 is an extension of the well-known AC-3 algorithm to non-binary CSPs. 
When a value is deleted from a variable, GAC-3 adds to a stack all constraints 
that involve that variable. Then, constraints are removed from the stack and are 
“revised”. Revising a constraint means searching for a new supporting tuple for 
the values of all variables in the constraint. Checking whether a variable-value 
assignment is consistent with respect to a constraint $c = (x_1, \ldots, x_k)$ 
involves finding all tuples $\langle a_1, \ldots, a_k \rangle$ in $c$ that contain 
this assignment and checking whether the values $a_1, \ldots, a_k$ are still in 
the domains of variables $x_1, \ldots, x_k$. The reason for this is that GAC-3 
like algorithms, in their standard implementation, do not update the lists of 
allowed tuples as GAC-4 does when a value is deleted. So, they cannot check 
directly whether a tuple $\langle a_1, \ldots, a_k \rangle$ is still valid. 
This results in extra operations compared to GAC-4, but on the other hand 
GAC-3 like algorithms avoid updating the usually large sets of allowed tuples 
(i.e., hidden variable domains) and require less space. Like GAC-4, a GAC-3 
algorithm that updates the lists of allowed tuples can be viewed as a binary 
algorithm that operates on the hidden variable encoding. GAC-schema jS] is 
another GAC algorithm that does not update the allowed tuples but instead 
looks for supports in a similar, but more sophisticated, way than GAC-3. 

Recently, the binary AC-3 algorithm has been modified to yield an algorithm 
with optimal worst-case time complexity mm- What makes the new AC-3 
algorithms optimal is the use of a pointer $\mathit{currentSupport}_{x,a,c}$ for 
each value $a$ of a variable $x$ involved in a constraint $c$ between $x$ and 
$y$. This pointer records the current value in the domain of $y$ that was found 
to be a support of $a$. After a value deletion, if we look for a new support for 
$a$ in $y$, we first check whether the value that 
$\mathit{currentSupport}_{x,a,c}$ points to is still in the domain of $y$. If 
not, we search for a new support starting from the value immediately after the 
current support. Assuming that the domains are ordered, [fill4] prove that the 
new algorithm is optimal. This algorithm can be extended to non-binary 
constraints in a straightforward way. Again, we can use a pointer 
$\mathit{currentSupport}_{x,a,c}$ that points to the last tuple (assuming an 
ordering of the tuples) in constraint $c$ that supported value $a$ of variable 
$x$, where $x$ is a variable involved in $c$. A sketch of the main functions of 
the algorithm, omitting the initialization phase, is shown in Figure 2. We now 
briefly discuss the complexity of this algorithm. 

Like GAC-3, when a variable-value pair $(x, a)$ is deleted, each constraint 
involving $x$ is pushed on the stack. Then, constraints are popped from the stack 
and revised. Each $k$-ary constraint can be revised at most $kd$ times, once for 
every deletion of a value from the domain of one of the $k$ variables. Since we 
use the pointers $\mathit{currentSupport}_{x,a,c}$, for each variable-value pair 
$(x, a)$ we check at most $d^{k-1}$ subtuples to find a support.³ This results 
in $O(kd \cdot d^{k-1})$ checks for one constraint in the worst case. For $e$ 
constraints the worst-case complexity, measured in consistency checks, becomes 
$O(ekd^k)$. To check whether a tuple is valid, in lines 3 and 4, we have to 
check whether the values in the tuple are present in the domains of the 
corresponding variables. If one of these values has been deleted then the tuple 
is not valid. 
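
The revision scheme described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: constraints are given extensionally as ordered lists of tuples, the support pointers are stored in a dictionary keyed by (variable, value, constraint name), and the names `is_valid`, `revise` and `propagate` are hypothetical.

```python
def is_valid(tup, scope, domains):
    """A tuple is valid if each of its values is still in the matching domain."""
    return all(value in domains[var] for var, value in zip(scope, tup))

def revise(x, name, constraints, domains, support):
    """Delete the values of x that have no valid supporting tuple in constraint name."""
    scope, tuples = constraints[name]
    pos = scope.index(x)
    deleted = False
    for a in sorted(domains[x]):
        start = support.get((x, a, name), -1)
        if start >= 0 and is_valid(tuples[start], scope, domains):
            continue  # the recorded support still holds
        # Resume the search just after the last recorded support.
        for j in range(start + 1, len(tuples)):
            if tuples[j][pos] == a and is_valid(tuples[j], scope, domains):
                support[(x, a, name)] = j
                break
        else:
            domains[x].discard(a)
            deleted = True
    return deleted

def propagate(constraints, domains, initial):
    """Return True if propagation ends without a domain wipeout, False otherwise."""
    support = {}
    stack = list(initial)
    while stack:
        name = stack.pop()
        for x in constraints[name][0]:
            if revise(x, name, constraints, domains, support):
                if not domains[x]:
                    return False
                todo = [n for n, (scope, _) in constraints.items()
                        if x in scope and n not in stack]
                stack.extend(todo)
    return True

if __name__ == "__main__":
    constraints = {"c": (("x", "y", "z"), [(0, 0, 1), (0, 1, 0), (1, 0, 0)])}
    domains = {"x": {1}, "y": {0, 1}, "z": {0, 1}}
    print(propagate(constraints, domains, ["c"]), domains)
```

Because values are only ever deleted, a tuple that was skipped once can never become a support later, which is why the search may safely resume after the recorded pointer.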



3.2 AC on the Hidden Variable Encoding 

As discussed, the worst-case cost of AC on the hidden variable encoding, 
measured in consistency checks, is the same as that of GAC on the non-binary 
representation. When GAC-4 and its equivalent in the encoding are used, we also 
get exactly the same number of primitive operations. We now analyze the 
difference between the extended GAC-3 algorithm and its equivalent on the 
encoding. To get the hidden variable equivalent of the GAC-3 algorithm shown in 
Figure 2, we need to make three changes. First, any references to constraints 
are replaced by references to hidden variables. For example, line 1 in Figure 2 
will read: “put in Q all hidden variables that involve $x_i$”. Second, after a 
value is removed from the domain of an original variable (line 4), all tuples 
that include that value are removed from the domains of the corresponding 
hidden variables. Third, checking whether a tuple is valid is done differently 
than in the non-binary case. If a tuple is not valid then one of its values has 
been removed from the domain of the corresponding variable. This means that the 
tuple has also been removed from the domain of the hidden variable. Therefore, 
to check the validity of a tuple we only need to look in the domain of the 
hidden variable and check whether the tuple is present. 

³ In fact, $\min\{d^{k-1}, |T|\}$ subtuples, where $|T|$ is the number of allowed 
tuples in the constraint. See [6,14] for details. 

function Propagation 
    while Q is not empty 
        pick c from Q 
        for each uninstantiated x_i ∈ c 
            if Revise(x_i, c) = TRUE then 
                if domain of x_i is empty then return INCONSISTENCY 
1               put in Q all constraints that involve x_i 
    return CONSISTENCY 

function Revise(x_i, c) 
    DELETION ← FALSE 
    for each value a in the domain of x_i 
2       if currentSupport_{x_i,a,c} is not valid then 
3           if ∃ τ (∈ c) > currentSupport_{x_i,a,c}, τ includes (x_i, a) and τ is valid 
                then currentSupport_{x_i,a,c} ← τ 
4           else remove a from the domain of x_i 
                DELETION ← TRUE 
    return DELETION 

Fig. 2. The algorithm of mM for non-binary CSPs. 

We will now show that the GAC algorithm of Figure 2 and its corresponding 
AC algorithm on the encoding perform the same number of consistency 
checks when applied to a problem that is GAC. Consider that if no domain 
wipeout in any variable (original or hidden) occurs then the two algorithms will 
add constraints (hidden variables) to the stack and remove them for revision 
in exactly the same order. The difference is that the binary version will revise 
domains of hidden variables as an extra step. However, this does not involve any 
consistency checks. Therefore, we only need to show that if a value is deleted 
from a variable during the revision of a constraint, or finds a new support in 
the constraint, then these operations require the same number of checks in both 
representations. Assume that in the non-binary version of the algorithm value 
$a$ is deleted from variable $x$ because it has no support in constraint $c$. If 
$|T|$ is the number of allowed tuples in $c$ then this will require 
$|T| - \mathit{currentSupport}_{x,a,c}$ checks, one for each of the tuples in $c$ 
that have not been checked yet. If the value is not deleted but finds a new 
support $\tau$, with $\tau > \mathit{currentSupport}_{x,a,c}$, then 
$\tau - \mathit{currentSupport}_{x,a,c}$ checks will be performed. In the hidden 
variable encoding, $x$ will be processed in the same order as in the non-binary 
version and we will require $|T| - \mathit{currentSupport}_{x,a,h_c}$ or 
$\tau - \mathit{currentSupport}_{x,a,h_c}$ checks depending on the case, where 
$h_c$ represents the hidden variable corresponding to $c$. Obviously, both 
supports are the same, since a tuple in $c$ corresponds to a value in $h_c$, and 
the same number of checks will be performed in both representations. 

On the other hand, on a problem that is not GAC, the AC algorithm on the 
encoding can perform fewer checks than the GAC algorithm. Consider a problem 
that includes variables $x_1$, $x_2$, $x_3$, $x_4$ with domains $\{0, 1\}$, 
$\{0, 1\}$, $\{0, \ldots, 9\}$, and $\{0, 1\}$, respectively. There are two 
constraints, $c$ and $c'$, over variables $(x_1, x_2, x_3)$ and 
$(x_1, x_2, x_4)$ respectively. Value 0 of $x_2$ is supported in $c$ by tuples 
that include the variable-value pair $(x_1, 1)$. Value 0 of $x_1$ is supported 
in $c'$ by tuples that include the variable-value pair $(x_2, 0)$. Values 
$0, \ldots, 9$ of $x_3$ are supported in $c$ by tuples that include $(x_2, 0)$ 
and by tuples that include $(x_2, 1)$. Assume that variable $x_1$ is 
instantiated to 0, which means that the deletion of 1 from $x_1$ must be 
propagated. In the encoding, we will first delete all tuples that include the 
pair $(x_1, 1)$ from hidden variables $h_c$ and $h_{c'}$. Then, we revise all 
original variables connected to hidden variables $h_c$ and $h_{c'}$. Assuming 
that $h_c$ is processed first, value 0 of $x_2$ will have no support in $h_c$, 
so it will be deleted. As a result, we will delete all tuples from hidden 
variable $h_{c'}$ that include the pair $(x_2, 0)$. This means that the domain 
of $h_{c'}$ will be wiped out. In the non-binary representation, after the 
deletion of 0 from $x_2$, we will find that value 1 of $x_2$ and all values of 
$x_3$ have supports in $c$. This will involve checks that are avoided in the 
encoding. The inconsistency will be discovered when we process constraint $c'$ 
and find out that value 1 of $x_2$ has no support in $c'$, resulting in the 
domain wipeout of $x_2$. 
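
The early wipeout in the encoding can be traced with a small Python sketch. The text does not list the tuples of $c$ and $c'$, so the concrete tuple sets below are an assumption chosen to satisfy the support conditions stated above; the helper names `prune_hidden` and `supported`, and the names `hc` and `hc2` for $h_c$ and $h_{c'}$, are likewise hypothetical.

```python
def prune_hidden(hidden, scopes, var, value):
    """Delete from every hidden domain the tuples that assign value to var.

    Return the names of the hidden variables whose domain became empty.
    """
    wiped = []
    for name, scope in scopes.items():
        if var in scope:
            pos = scope.index(var)
            hidden[name] = [t for t in hidden[name] if t[pos] != value]
            if not hidden[name]:
                wiped.append(name)
    return wiped

def supported(hidden, scopes, name, var, value):
    """In the encoding, a value is supported if some remaining tuple contains it."""
    pos = scopes[name].index(var)
    return any(t[pos] == value for t in hidden[name])

# Assumed tuple sets: in c, x2 = 0 occurs only with x1 = 1;
# in c', x1 = 0 occurs only with x2 = 0.
scopes = {"hc": ("x1", "x2", "x3"), "hc2": ("x1", "x2", "x4")}
hidden = {
    "hc": [(1, 0, z) for z in range(10)] + [(0, 1, z) for z in range(10)],
    "hc2": [(0, 0, w) for w in (0, 1)] + [(1, 1, w) for w in (0, 1)],
}

if __name__ == "__main__":
    # x1 is instantiated to 0, so the pair (x1, 1) is deleted.
    print(prune_hidden(hidden, scopes, "x1", 1))
    # Value 0 of x2 has lost all its supports in h_c ...
    print(supported(hidden, scopes, "hc", "x2", 0))
    # ... so (x2, 0) is deleted, which empties the domain of h_c'.
    print(prune_hidden(hidden, scopes, "x2", 0))
```

In this sketch the inconsistency surfaces as soon as the tuples containing $(x_2, 0)$ are removed, without searching for supports of value 1 of $x_2$ or of the ten values of $x_3$.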

We have demonstrated that AC in the hidden variable encoding can detect 
an inconsistency with fewer checks than GAC in the non-binary representation, 
while on graphs that are AC both algorithms will perform the same checks. This 
does not mean that algorithms on the encoding will always be more efficient in 
run time, because the run time of an algorithm depends on the total number 
of primitive operations it performs. There is a trade-off in the operations 
that the GAC algorithm performs in the non-binary version compared to the 
binary one. Assuming there are $k_p$ past (instantiated) and $k_f$ future 
variables in a constraint with $|T|$ allowed tuples, the binary GAC-3 algorithm 
will, in the worst case, perform $O(k_f d \cdot d^{k_f - 1})$ checks plus 
$O(|T|)$ updates in the domain of the hidden variable, when applied on the 
encoding. That is, the worst-case complexity in the number of primitive 
operations is $O(k_f d^{k_f} + |T|)$. The non-binary GAC-3 will perform 
$O(k k_f d \cdot d^{k_f - 1})$ operations in the worst case. That is, for 
every check, the algorithm will have to make $O(k)$ domain checks to make sure 
that the checked tuple is valid. 
4 Search Algorithms

Like GAC algorithms, non-binary search algorithms can be simulated by
equivalent algorithms that run on the hidden variable encoding. For example, it has
been shown that the MGAC algorithm on a non-binary CSP is equivalent to
MAC on the hidden variable encoding of the CSP when only original variables
are instantiated and similar branching heuristics are used [?]. We now show
that similar results hold for generalized versions of forward checking (FC).

According to the simplest generalization of FC, forward checking is performed
only after $k-1$ variables of a $k$-ary constraint have been instantiated. This
algorithm is called nFC0 in [?]. More, and stronger, generalizations of FC to
non-binary constraints were introduced in [?]. These generalizations differ
in the extent of look-ahead they perform after each variable instantiation.
For example, algorithm nFC5, which is the strongest version, tries to make the
set of constraints involving at least one past variable and at least one future
variable GAC. All the generalizations reduce to simple FC when applied to binary
constraints.
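As an illustration of the weakest generalization, the following Python sketch forward checks a single $k$-ary constraint only once exactly one of its variables remains uninstantiated, in the spirit of nFC0. It is a hedged sketch: the function name, the argument layout, and the extensional tuple-list representation are assumptions made for this example and are not taken from the cited papers.

```python
def nfc0_forward_check(scope, allowed, assignment, domains):
    """Forward check one k-ary constraint in the style of nFC0.

    scope: tuple of variable names; allowed: list of allowed tuples;
    assignment: dict past variable -> value; domains: dict variable -> set
    (modified in place). Returns False if the future variable is wiped out.
    """
    future = [x for x in scope if x not in assignment]
    if len(future) != 1:
        return True  # fewer than k-1 variables instantiated: no look-ahead yet
    x = future[0]
    pos = scope.index(x)
    # Values of x that appear in a tuple agreeing with every past variable.
    supported = {
        t[pos] for t in allowed
        if all(t[i] == assignment[y] for i, y in enumerate(scope) if y != x)
    }
    domains[x] &= supported
    return bool(domains[x])
```

The stronger versions differ in which constraints and variables they revise after each instantiation; this sketch covers only the trigger condition of the simplest one.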

Here we will show that the various versions of nFC are equivalent, in terms
of visited nodes, to binary versions of FC that run on the hidden variable
encoding of the problem. As mentioned, this holds under the assumption that the
binary algorithms only instantiate original variables and use similar
branching heuristics to their non-binary counterparts. We call these binary algorithms
hFC0-hFC5. Each binary algorithm performs the same amount of propagation
as the corresponding non-binary algorithm. For example, hFC5 will enforce AC
on the set of hidden variables, and original variables connected to them, such
that each hidden variable is connected to at least one past original variable
and at least one future original variable. The equivalence between nFC1 and an
algorithm called FC+ in [?] has already been proven in [3].

Proposition 1. In any non-binary CSP, algorithms nFC0-nFC5 are equivalent
to binary forward checking algorithms hFC0-hFC5 that operate on the hidden 
variable encoding of the problem resulting in the same node visits. 

Proof. We prove this for nFC5, the strongest among the generalized FC
algorithms. Proofs for the other versions are similar. We only need to prove that at
each node of the search tree algorithms nFC5 and hFC5 will delete exactly the
same values from original variables. Assume that at some node, after
instantiating the current variable, nFC5 deletes value $a$ from a future variable $x$ because
it found no support in a constraint $c$ that has at least one instantiated variable.
hFC5 will also delete this value from $x$ because it will find no consistent tuple
in the corresponding hidden variable $h_c$. This is due to the fact that the current
domain of $h_c$ will contain only valid tuples with respect to the current variable
domains of the original variables, since inconsistent ones will have been deleted
either in a previous run of AC, or after the instantiation of the current variable
(recall that $h_c$ contains at least one instantiated variable). Now in the opposite
case, if hFC5 deletes value $a$ from an original variable $x$, it means that all tuples
including that assignment are not present in the domain of a hidden variable
$h_c$ that includes $x$ and at least one past variable. In other words, there is no
consistent tuple in $c$, with respect to the current variable domains, that contains
the assignment $x = a$. As a result, nFC5 will remove $a$ from the domain of $x$.

□ 

Therefore, if we never instantiate hidden variables in the binary representation 
and apply algorithms hFC0-hFC5 we will end up with the same node visits as the 
respective nFC0-nFC5 algorithms in the non-binary representation. Note that
in [2] experimental results show differences between FC on the hidden variable
encoding and non-binary FC. However, the algorithms compared there were
FC+ and nFC0, which are not equivalent. We have also experimented with a
stronger version of hFC5, which we call hFC5b, that visits fewer nodes than
nFC5 and hFC5 but may perform more operations at each node. hFC5b is an FC
algorithm that operates exactly like hFC5 in that no original variable involved 
in constraints that contain only future variables is revised. If however a value 
is deleted from some future variable x because of a constraint between x and 
past variables then all hidden variables connected to x are revised, including 
hidden variables that are only connected to future originals. Observe that there 
is no equivalent to hFC5b that applies on the non-binary representation. In 
general, the hidden variable encoding is a flexible representation that allows 
for the definition of algorithms that maintain more refined consistency levels 
depending on which hidden variables are updated. 

5 Instantiating Hidden Variables 

So far we have shown that solving an extensionally defined CSP by using the 
non-binary representation is in many ways equivalent to solving it using the 
hidden variable encoding, assuming that only original variables are instantiated. 
A natural question is whether search techniques which are inapplicable in the 
non-binary case can be applied on the encoding. The answer is the ability of a 
search algorithm that operates on the encoding to select and instantiate hidden 
variables. In the equivalent non-binary representation this would imply instan- 
tiating values of variables simultaneously. To implement such an algorithm we 
would have to modify standard search algorithms and heuristics or devise new 
ones. On the other hand, in the hidden variable encoding an algorithm that in- 
stantiates hidden variables can be easily implemented using a standard search 
algorithm and branching heuristic. Note that if we only instantiate original
variables then the hidden variables will be instantiated implicitly. That is, when all
the original variables connected to a hidden variable are instantiated, the domain
of the hidden variable is reduced to a singleton (i.e., it is instantiated). As the
next section shows, by instantiating hidden variables in the encoding we can also 
achieve higher levels of consistency than in the non-binary representation. 

5.1 Singleton Consistencies 

We know that enforcing AC in the hidden variable encoding is equivalent to
enforcing GAC in the original problem. Here we prove that when we move up to
the consistency level of SAC, enforcing it on the hidden variable encoding is
strictly stronger than enforcing SGAC on the original problem. This is derived
from the ability of SAC to instantiate hidden variables and check their consistency.
We denote by $P|_{D_i=\{a\}}$ the CSP obtained by restricting the domain of variable
$x_i$ to $\{a\}$ in a CSP $P$.

Proposition 2. Achieving singleton arc consistency on the hidden variable en- 
coding of a non-binary problem is strictly stronger than achieving singleton gen- 
eralized arc consistency on the variables in the original problem. 

Proof. We have to prove that if a value $a$ of a variable $x_i$ in a CSP $P$ is not
SGAC then SAC on the encoding of $P$ will prune that value. From [?] we know
that if a value $b$ of variable $x_j$ is not GAC in $P|_{D_i=\{a\}}$ then it is also arc
inconsistent in the encoding of $P|_{D_i=\{a\}}$. For SGAC to remove value $a$, all values
in a variable $x_j$ must be deleted when $a$ is assigned to $x_i$. According to the
above, all such values will also be deleted from the domain of $x_j$ in the hidden
variable encoding of $P|_{D_i=\{a\}}$. Therefore, value $a$ will be singleton arc
inconsistent in the hidden variable encoding. To show strictness, consider a problem
with five variables $\{x_1, x_2, \ldots, x_5\}$, all of them with domain $\{0, 1\}$, and the
following ternary constraints: a constraint over $\{x_1, x_2, x_3\}$ with allowed tuples
$\{\langle 0,0,1 \rangle, \langle 0,1,0 \rangle, \langle 1,0,0 \rangle, \langle 1,1,1 \rangle\}$, a constraint over $\{x_1, x_2, x_4\}$
with allowed tuples $\{\langle 0,0,1 \rangle, \langle 0,1,0 \rangle, \langle 1,0,0 \rangle, \langle 1,1,1 \rangle\}$, and a
constraint over $\{x_1, x_2, x_5\}$ with allowed tuples $\{\langle 0,1,0 \rangle, \langle 1,0,1 \rangle\}$. Enforcing
SGAC on this problem will make no deletions. However, enforcing SAC on the
encoding will show that the problem is insoluble. If we take the hidden
variable $h_1$ corresponding to the constraint over $\{x_1, x_2, x_3\}$, for example, enforcing
SAC will delete all the tuples from its domain because they are all singleton arc
inconsistent.

□ 

In [12] it is proved that all consistency levels between SAC and AC (e.g. 
path inverse consistency and restricted path consistency) collapse onto AC, in 
the hidden variable encoding. Also, neighborhood inverse consistency, which is
incomparable to SAC, collapses onto AC. Therefore, the weakest consistency level
where we notice a gap between the amount of pruning achieved in the hidden 
encoding and the non-binary representation is SAC. In fact, to get the pruning 
achieved by SAC in the encoding we only need to consider the hidden variables. 
For example, if all tuples in a hidden variable that include the variable-value 
pair (x,a) are removed by SAC then so will the value a from x. However, the 
extra pruning achieved in the encoding incurs extra cost because of the (usually) 
large domain sizes of the hidden variables. If we restrict SAC on the encoding to the
original variables only, then we get the same level of consistency as SGAC in the
original problem. The proof is easy and is omitted due to space restrictions. 
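The difference between the two singleton tests can be illustrated with a short Python sketch. It relies on the equivalence stated above between AC on the encoding and GAC on the original problem, so a singleton test on a hidden variable is simulated by fixing every variable in the constraint's scope to the values of one tuple and running GAC. The function names, the single (non-iterated) pass over all candidates, and the small two-constraint problem in the usage note are assumptions made for illustration only.

```python
def gac(constraints, domains):
    """Prune domains (dict variable -> set) to GAC; False on a domain wipeout."""
    changed = True
    while changed:
        changed = False
        for scope, allowed in constraints:
            valid = [t for t in allowed
                     if all(v in domains[x] for x, v in zip(scope, t))]
            for pos, x in enumerate(scope):
                supported = {t[pos] for t in valid}
                if not domains[x] <= supported:
                    domains[x] &= supported
                    changed = True
                if not domains[x]:
                    return False
    return True


def sgac_inconsistent_values(constraints, domains):
    """Pairs (x, a) whose singleton test on an original variable fails."""
    bad = []
    for x in domains:
        for a in sorted(domains[x]):
            trial = {y: set(d) for y, d in domains.items()}
            trial[x] = {a}
            if not gac(constraints, trial):
                bad.append((x, a))
    return bad


def sac_inconsistent_tuples(constraints, domains):
    """Pairs (scope, tuple) whose singleton test on a hidden variable fails."""
    bad = []
    for scope, allowed in constraints:
        for t in allowed:
            trial = {y: set(d) for y, d in domains.items()}
            for x, v in zip(scope, t):
                trial[x] = {v}  # instantiating the hidden variable to t
            if not gac(constraints, trial):
                bad.append((scope, t))
    return bad
```

On a hypothetical problem with constraints over (x1, x2, x3) allowing (0,0,1), (0,1,0), (1,0,0), (1,1,1) and over (x1, x2, x5) allowing (0,1,0), (1,0,1), the first function reports nothing, while the second reports both tuples of the first constraint that contain x3 = 1. Since every tuple supporting x3 = 1 is removed, that value can be pruned on the encoding even though the singleton test on original variables keeps it.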

6 Experimental Results 

In this section we study empirically the efficiency of algorithms that run on
the hidden variable encoding compared to their non-binary counterparts. For
the empirical investigation we use randomly generated problems and benchmark
crossword puzzle generation problems. Both of these classes are naturally defined
by an extensional representation of the constraints. In the case of crossword
puzzles the constraints are by nature very tight. In the case of random problems
we also focus our attention on tight instances. The reason is that the binary
encoding can only be practical if the constraints are tight enough so that the
domains of the hidden variables are not prohibitively large.

6.1 Random Problems 

Random problems were generated using the extended model B as in [3]. Under
this model, a random CSP is defined by five parameters $\langle n, d, k, p, q \rangle$, where
$n$ is the number of variables, $d$ the domain size, $k$ the arity of the constraints,
$p$ the density of the generated graph, and $q$ the looseness of the constraints. $p$
and $q$ are given as percentages of the constrained variable combinations and of the
allowed tuples in these constraints, respectively. In this empirical comparison we
included the following algorithms: MGAC; MHAC, which stands for MAC in
the encoding that only instantiates original variables; nFC5; hFC5; and hFC5b.
hFC5 and hFC5b also instantiate only original variables. All algorithms use the
dom/deg heuristic for variable ordering [3] and lexicographic value ordering. The
GAC and AC algorithms used are the ones described in Sections 3.1 and 3.2. We
chose to use these algorithms because they have a good asymptotic complexity
and they are easy to implement. We do not include results on algorithms that can
instantiate hidden variables as well as original ones because experiments showed
that such algorithms have very similar behavior to the corresponding algorithms
that instantiate only original variables. The reason is that, because of the nature
of the constraints, the dom/deg heuristic almost always selects original variables.
In the rare cases where the heuristic selected hidden variables, this resulted in
an increase in node visits. Table 1 shows the performance of the algorithms on
four classes of randomly generated ternary CSPs. All classes are from the hard
phase transition region. Classes 1 and 2 are sparse, 3 is very sparse, and 4 is
again relatively sparse but denser than the others. We report node visits, CPU
times, and consistency checks. A consistency check consists of two operations:

1) checking if a tuple $\tau$ includes the value for which we search for support, and

2) checking if $\tau$ is valid.
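To make the five parameters concrete, here is a minimal Python sketch of a generator in the spirit of model B. It is an assumption-laden illustration: the function name, the use of rounding to turn percentages into counts, and the seeded generator are choices made here, and the extended model B of [3] may differ in its details.

```python
import random
from itertools import combinations, product
from math import comb


def model_b(n, d, k, p, q, seed=0):
    """Generate a random k-ary CSP given <n, d, k, p, q>.

    p is the percentage of the C(n, k) variable combinations that are
    constrained; q is the percentage of the d**k tuples allowed in each
    constraint. Returns a list of (scope, allowed_tuples) pairs.
    """
    rng = random.Random(seed)
    n_constraints = round(p / 100 * comb(n, k))
    n_allowed = round(q / 100 * d ** k)
    # Sample distinct scopes, then distinct allowed tuples for each scope.
    scopes = rng.sample(list(combinations(range(n), k)), n_constraints)
    all_tuples = list(product(range(d), repeat=k))
    return [(scope, rng.sample(all_tuples, n_allowed)) for scope in scopes]
```

Under the rounding assumed here, the parameters of class 1 (n = 30, d = 6, k = 3, p = 1.847, q = 50) yield 75 constraints with 108 allowed tuples each, so each hidden variable would start with a domain of 108 values.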

From Table 1 we can see that algorithms that operate on the encoding and
instantiate only original variables perform fewer checks in all classes than the
corresponding non-binary algorithms. This is due to their ability to detect
domain wipeouts early at dead ends. CPU times are influenced not only by the
number of checks but by the total number of primitive operations performed.
We can see that MHAC performs better than MGAC on the sparser problems.
However, the differences in classes 1 and 2 are marginal. In general, for all the
3-ary classes we tried with density less than 3%-4%, the relative performance
of MHAC and MGAC (in run times) ranged from being equal to a 40%
advantage for MHAC. The differences are more notable on the very sparse class 3.
This is due to the fact that for sparse problems the hard region is located at
low constraint tightnesses (i.e., small domains for hidden variables) where only
a few operations are required for the revision of hidden variables. Another factor
contributing to the dominance of the binary algorithms in class 3 is the arity of
the constraints. The non-binary algorithms require more operations to check the
validity of tuples when the tuples are of large arity, as explained in Section 3.1.
When the density of the graph increases (class 4), the overhead of revising the
large domains of hidden variables and restoring them after failed instantiations
slows down the binary algorithms, and as a result they are outperformed by the
non-binary ones. For denser classes than the ones reported, the phase transition
region is at a point where more than half of the tuples are allowed, and in such
cases the non-binary algorithms perform even better.

Table 1. Comparison of algorithms on sparse random classes. Classes 1 and 2 taken
from [3]. CPU times are in seconds. For nodes and checks we give mean numbers for
50 instances at each class. "K" implies $\times 10^3$ and "M" implies $\times 10^6$.

                 nFC5     hFC5     hFC5b    MGAC     MHAC

class 1: n = 30, d = 6, k = 3, p = 1.847, q = 50
  nodes          4645     4645     4150     3430     3430
  sec            1.47     1.65     1.90     2.08     1.90
  checks         13M      11M      10M      20M      14M

class 2: n = 75, d = 5, k = 3, p = 0.177, q = 41
  nodes          21976    21976    16723    7501     7501
  sec            5.67     6.90     5.63     4.09     3.41
  checks         17M      16M      12M      24M      15M

class 3: n = 50, d = 10, k = 5, p = 0.001, q = 0.5
  nodes          21283    21283    20260    16496    16496
  sec            58.56    22.25    27.73    74.72    22.53
  checks         783M     643M     631M     847M     628M

class 4: n = 20, d = 10, k = 3, p = 5, q = 40
  nodes          5400     5400     5124     4834     4834
  sec            4.19     5.19     7.78     5.75     8.15
  checks         119M     99M      95M      151M     119M

6.2 Crossword Puzzles 

Crossword puzzle generation problems have been used for the evaluation of
search heuristics for CSPs [?] and binary encodings of non-binary problems
[?]. Tables 2 and 3 show the performance of the tested algorithms for
various crossword puzzles in running time and number of visited nodes. We used
selected hard puzzles from [?] and 20 15x15 and 19x19 puzzles from [2]. Apart
from algorithms that instantiate only original variables, we tested versions of hFC5
and MAC which may also instantiate hidden variables. We call these algorithms
hidFC5, hidFC5b, and hidMAC. Again, all algorithms use the dom/deg
heuristic for variable ordering. An em-dash ( — ) is placed wherever some method did
not manage to find a solution within 5 hours of cpu-time. $n$ is the number of
words and $m$ is the number of blanks in each puzzle. Problems marked by (*)
are insoluble.

We used the Unix dictionary for the allowed words in the puzzles. Four
puzzles (15.06, 15.10, 19.03, 19.04) could not be solved by any of the algorithms
within 5 hours of cpu time. Also, two puzzles (19.05 and 19.10) were arc
inconsistent. GAC discovered the inconsistency more slowly than HAC in both cases
(around 3:1 time difference in 19.05 and 10:1 in 19.10) because the latter method
discovered early the domain wipe-out of a hidden variable.

For the rest of the puzzles we can observe that MHAC usually performs better
than MGAC on the hard instances. For the hard insoluble puzzles the difference
is considerable, and so is the difference between hFC5 and nFC5. This is mainly
due to the uniformly large arity of the constraints in these classes.¹ Another
interesting observation is that there can be large differences between the
performance of methods that instantiate hidden variables and those which instantiate
only original ones. In many cases hidMAC managed to find a (different)
solution earlier than MHAC and MGAC. This shows that we can benefit from a
method that instantiates hidden variables. In puzzle 19.08 hidMAC managed to
find a solution fast, while the other MAC algorithms thrashed. Note that the
FC algorithms also found a solution quickly, which means that in this case the
propagation of MGAC and MHAC misguided the variable ordering heuristic.
On the other hand, the hid* methods were also subject to thrashing in instances
where other methods terminate. The fact that in all insoluble puzzles hidMAC
did not do better than MHAC shows that its performance is largely due to the
variable ordering scheme. When comparing MAC methods with equivalent FC5
ones, we see that in most cases maintaining full consistency is better for this class
of problems. Also, the hFC5b and hidFC5b algorithms do not always pay off.

Regarding node visits, observe that in many cases hidden variable
instantiation methods visit fewer nodes than their original variable counterparts, but this
is not reflected in a comparable difference in run time, because when a hidden
variable is instantiated hidMAC does more work than when an original one
is. It has to instantiate automatically all original variables involved in the hidden
variable and propagate these changes to all other hidden variables containing them.
Note that constraints in crosswords are much tighter than the constraints in
random problems. For example, the tightness of a 6-ary constraint in a puzzle is
99.999988%. This is why the hid* methods can perform well on such problems.
Consistent problems with such high tightnesses cannot be generated randomly.

In general, we believe that if we better exploit the potential of instantiating
hidden variables (i.e., by a suitable variable ordering heuristic), methods that
instantiate hidden variables can go down the search tree faster than ones that
consider only original variables, because they can benefit from small hidden
variable domains. Notice that hidMAC reduces to MHAC if it instantiates only
original variables. Therefore, if employed with the optimal variable ordering it
can never be worse than MHAC. We are currently working towards devising such
ordering heuristics.

¹ Puzzles 6x6-10x10 correspond to square grids with no blank squares.
Table 2. Comparison (in cpu time) of algorithms on crossword puzzles. All times are
in seconds except those followed by "m" (minutes).

puzzle    n     m     MGAC    MHAC    hidMAC  nFC5    hFC5    hidFC5  hFC5b   hidFC5b
15.01     78    189   8.5     7.9     4.4     11.5    15.4    5.3     10.1    4.2
15.02     80    191   24.5    26.9    —       77.8    138.7   —       61.1    —
15.03     78    189   4.2     4.6     2.3     21.2    30.6    2.3     30.9    2.81
15.04*    76    193   290     295     218     24.5    29.8    979     243     791
15.05     78    181   3       3.1     2.2     3.7     3.8     3.3     4.8     2.5
15.07     74    193   670     335     376m    48.3    39.4    482m    465m    367m
15.08     84    186   2.32    2.27    2.89    3.22    3.37    3.52    3.27    3.1
15.09     82    187   2.24    2.3     2.45    1.92    1.81    —       2.43    —
19.01     128   301   7.6     7.3     6.9     —       —       4.56    —       4.8
19.02     118   296   198     204     —       —       —       —       495     —
19.06     128   287   5.9     4.7     5.8     4.1     4.9     4.6     5       —
19.07     134   291   3.4     3.4     4.4     4.1     4.1     5.2     3.8     5.2
19.08     130   295   —       —       5.45    4       3.3     4.7     3.6     4.7
19.09     130   295   3.64    5       4.2     6.2     6.7     4.6     4.8     4.8
puzzleC   78    189   77.5    107     —       153     209     —       115     —
6x6       12    36    84      55      64      109     75      104     73      79
7x7*      14    49    120m    75m     96m     176m    107m    159m    120m    148m
8x8*      16    64    45m     29m     42m     58m     32m     57m     35m     59
9x9*      18    81    488     337     454     868     470     737     614     797
10x10*    20    100   117.7   77      93      534     331     363     192     217

Table 3. Comparison (in node visits) of algorithms on crossword puzzles. MGAC and
MHAC visit the same number of nodes and this holds also for nFC5 and hFC5.

puzzle    n     m     MGAC, MHAC  hidMAC   nFC5, hFC5  hidFC5    hFC5b     hidFC5b
15.01     78    189   574         200      1607        398       1067      295
15.02     80    191   1312        —        15559       —         6029      —
15.03     78    189   338         126      4105        159       3364      183
15.04*    76    193   19667       18479    2869        75450     25202     63985
15.05     78    181   286         145      528         248       459       189
15.07     74    193   12733       568768   4180        1504450   2700150   744180
15.08     84    186   247         165      362         277       294       187
15.09     82    187   251         155      247         —         287       —
19.01     128   301   469         309      —           224       —         202
19.02     118   296   15764       —        —           —         33079     —
19.06     128   287   375         158      357         200       346       —
19.07     134   291   305         206      344         240       306       222
19.08     130   295   —           191      332         249       322       218
19.09     130   295   308         167      458         199       347       171
puzzleC   78    189   9827        —        26315       —         11820     —
6x6       12    36    2263        2097     7332        5735      5028      4259
7x7*      14    49    116082      138199   634858      455716    396791    303330
8x8*      16    64    31386       40037    231950      163527    108338    78076
9x9*      18    81    4972        5715     71020       35736     23279     14344
10x10*    20    100   1027        1120     35492       18922     13105     10438

7 Conclusion 

In this paper, we performed a theoretical and empirical investigation of arc
consistency and search algorithms for the hidden variable encoding of non-binary
CSPs. We analyzed the potential benefits of using AC algorithms on the hidden
encoding compared to GAC algorithms on the non-binary representation. We
showed that FC algorithms for non-binary constraints can be emulated by
corresponding binary algorithms that operate on the hidden variable encoding and
only instantiate original variables. Empirical results on various implementations
of search algorithms showed that the hidden variable encoding is competitive with, and in many
cases better than, the non-binary representation for tight classes of non-binary
constraints. A general conclusion from this study is that there is an
interesting mapping between algorithms for non-binary constraints and corresponding
algorithms for binary encodings, even in refined levels of implementation. For
future work we plan to develop variable ordering heuristics more suitable to the
hidden encoding. Also, we intend to investigate how lessons learned from this
study apply to other GAC algorithms, like GAC-schema.
Abstract. This paper describes a filtering algorithm for a type of
constraint that often arises in rostering problems but that also has wider
application. Defined on a sequence of variables, the stretch constraint 
restricts the number of consecutive identical values in the sequence. The 
algorithm mainly proceeds by determining intervals in which a given 
stretch must lie and then reasoning about them to filter out values. It is 
shown to have low time complexity and significant pruning capability as 
evidenced by experimental results. 



Introduction 

A number of global constraints introduced in the constraint programming
literature have successfully encapsulated powerful filtering algorithms, often inspired
from existing ones, while remaining sufficiently generic to ensure wide
applicability (e.g. [?]). This paper proposes another such constraint that often
arises in rostering problems, for example. Defined on a sequence of variables,
the stretch constraint specifies lower and upper limits on the number of
consecutive identical values in that sequence. These limits may also depend on the
value. The filtering algorithm mainly proceeds by determining intervals in which
a given stretch must lie and then reasoning about them to filter out values.

The rest of the paper is organized as follows. The next section briefly
describes the usual context in which the constraint is found. Section 2 presents
a formulation of the constraint while section 3 explains the filtering algorithm
which is used to enforce it. Some experimental results are then reported in
section 4 to assess the algorithm's efficiency. Section 5 discusses some consistency
issues. Finally, section 6 presents concluding remarks on the applicability of such
a constraint.



1 Rostering 

Many industries and public services operate around the clock, seven days a week.
In such a context, every day a number of work shifts must be covered by one or
a team of workers. A workload requirement matrix is usually given or computed
which specifies the number of (teams of) workers required for each type of shift
of every day, either as a precise value or an interval. For shift work, it is often
preferable to schedule work stretches of the same type of shift for individuals: it
is easier on their internal body clock as well as on their family and social life [4].

T. Walsh (Ed.): CP 2001, LNCS 2239, pp. 183-195, 2001.

Accordingly, restrictions on the length of a work stretch are usually given 
as part of the formulation of the problem. Work stretches will typically be con- 
strained to span at least one or two shifts and at most six or seven. Such a 
restriction will rarely vary between shift types except maybe for days off. Per- 
mitted patterns of shift types are also usually given. As a special case of this, a 
common restriction is that between two consecutive work stretches, some mini- 
mum number of days off should be given. Other constraints are present too that 
do not concern us here. 



1.1 Rotating Schedules 

When the personnel is interchangeable, rotating schedules, a repeating pattern of 
sequences of work and rest days alternating over several weeks, are particularly 
well adapted. A schedule is given over a cycle of w weeks and the workforce 
is divided into w teams: initially the first team follows the schedule of week 
1, the second one the schedule of week 2, and so forth. After the seventh day, 
each moves to the next week, the team on week w moving up to week 1. In 
effect, everyone has an identical schedule but that is out of phase with the other 
teams. This ensures that everybody is treated equally. An example of a rotating
schedule is given in Table 1.



Table 1. A simple rotating schedule. Symbols "D", "E", and "N" indicate day, evening,
and night shifts respectively whereas "-" indicates a day off.

Week   Mon  Tue  Wed  Thu  Fri  Sat  Sun
1      -    -    -    D    D    D    D
2      -    -    E    E    E    -    -
3      D    D    D    -    -    E    E
4      E    E    -    -    N    N    N
5      N    N    N    N    -    -    -

That example features stretches of length four and three for day shifts; three 
and four for evening shifts; seven for night shifts; two, two, two, two, and six for 
days off. 
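These stretch lengths can be recomputed mechanically by reading the five weeks of Table 1 as one cyclic sequence of 35 shifts. The Python sketch below does this; the function name and the string encoding of the schedule are assumptions made for the illustration.

```python
def cyclic_stretches(seq):
    """Return (value, start, length) for each maximal run in a cyclic sequence."""
    n = len(seq)
    if all(v == seq[0] for v in seq):
        return [(seq[0], 0, n)]  # a single stretch covering the whole cycle
    # Begin at an index where a new stretch starts (seq[-1] wraps around).
    start = next(i for i in range(n) if seq[i] != seq[i - 1])
    stretches, i, covered = [], start, 0
    while covered < n:
        length = 1
        while seq[(i + length) % n] == seq[i]:
            length += 1
        stretches.append((seq[i], i, length))
        i = (i + length) % n
        covered += length
    return stretches


# The schedule of Table 1, week after week, Monday to Sunday.
schedule = "---DDDD" "--EEE--" "DDD--EE" "EE--NNN" "NNNN---"
lengths = {}
for value, _, length in cyclic_stretches(schedule):
    lengths.setdefault(value, []).append(length)
# lengths == {"D": [4, 3], "-": [2, 2, 2, 2, 6], "E": [3, 4], "N": [7]}
```

The stretch of six days off is the one that wraps around from the end of week 5 to the start of week 1, which is why the sequence has to be treated as cyclic.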

1.2 Personalized Schedules 

When members of the personnel have individual restrictions or preferences that 
must be taken into consideration, such as unavailabilities due to other activi- 
ties, rotating schedules become inappropriate. Personalized schedules for each 
member of personnel are then elaborated. This is typical of some categories of
personnel, such as physicians.

Instead of a cyclic schedule as before, many individual rosters need to be 
designed, spanning a given scheduling horizon. Constraints on the length of work 
stretches are still relevant in this context and may vary from one individual to 
another. 



2 The Stretch Constraint 

Let work shifts be numbered consecutively from 0 to $n-1$.¹ Consider a constraint
programming model for the rostering problem in which a sequence of decision
variables $s_0, s_1, \ldots, s_{n-1}$ stands for consecutive work shifts, either representing
the whole roster, in the case of rotating schedules, or one individual roster, in
the case of personalized schedules. In the rest of the paper, we will take the
point of view of rotating schedules, for which the sequence of shifts is cyclic;
consequently indices will be computed modulo $n$. Let $D_{s_i} \subseteq T$ denote the
domain of $s_i$, where $T = \{\tau_1, \tau_2, \ldots, \tau_m\}$ is the set of shift types, including one
corresponding to a day off.

Definition 1. Subsequence $s_i, \ldots, s_j$ is called a stretch when
$s_{(i-1) \bmod n} \neq s_i = s_{(i+1) \bmod n} = \cdots = s_{(j-1) \bmod n} = s_j \neq s_{(j+1) \bmod n}$. The span of
a stretch from indices $i$ to $j$, denoted span($i$, $j$), is defined as $1 + (j - i) \bmod n$.



Definition 2. We call pattern two contiguous work stretches of different types
(e.g. $\tau_1, \tau_1, \tau_1, \tau_2, \tau_2$, denoted $\tau_1\tau_2$).

As indicated before, instances of rostering problems often restrict which pat- 
terns may appear in a schedule. One sometimes meets slightly more complex 
prescribed arrangements of two work stretches of given types separated by a 
stretch of rest shifts (e.g. $\tau_1, \tau_1, \tau_m, \tau_2$). Though more elaborate patterns are
conceivable, the previous two cases are sufficiently expressive for all real-life 
rostering instances encountered by the author so far. 

Let $\underline{\lambda}$ and $\overline{\lambda}$ be integer vectors of length $m$, $\Pi$ be a set of patterns, and $\gamma$ be
a boolean value. The stretch constraint may then be formulated as

stretch($\langle s_0, s_1, \ldots, s_{n-1} \rangle$, $\underline{\lambda}$, $\overline{\lambda}$, $\Pi$, $\gamma$)

with the following semantics:

$\forall\, 0 \le i \le n-1$, the span of the stretch through $s_i$ lies between $\underline{\lambda}_{s_i}$ and $\overline{\lambda}_{s_i}$.

As $\underline{\lambda}$ and $\overline{\lambda}$ respectively represent minimum and maximum lengths for
stretches, the constraint is only well-defined when $\underline{\lambda}_k \le \overline{\lambda}_k$ $\forall k$. Set $\Pi$
represents the permitted patterns for the sequence; they are used to refine the
filtering but are not enforced by this constraint. The value true for $\gamma$ indicates
a cyclic schedule where $s_{n-1}$'s successor in the sequence is $s_0$; the value false
indicates a sequence with no wrap around.

¹ We choose to start with 0 in order to simplify subsequent modulo expressions.
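For a fully instantiated sequence, the semantics above reduce to a simple check on the length of every stretch. The following Python sketch illustrates this under stated assumptions: the function names are hypothetical, the minimum and maximum length vectors are passed as dictionaries keyed by shift type, and the set of patterns is left out because it is not enforced by the constraint.

```python
def runs(seq, cyclic):
    """List (value, length) for each stretch of seq, wrapping around if cyclic."""
    n = len(seq)
    if n == 0:
        return []
    if cyclic and all(v == seq[0] for v in seq):
        return [(seq[0], n)]
    # For a cyclic sequence, start scanning where a new stretch begins.
    start = next(i for i in range(n) if seq[i] != seq[i - 1]) if cyclic else 0
    out, i, covered = [], start, 0
    while covered < n:
        length = 1
        while covered + length < n and seq[(i + length) % n] == seq[i]:
            length += 1
        out.append((seq[i], length))
        i = (i + length) % n
        covered += length
    return out


def satisfies_stretch(seq, lam_min, lam_max, cyclic):
    """True if every stretch length lies within the bounds for its value."""
    return all(lam_min[v] <= length <= lam_max[v]
               for v, length in runs(seq, cyclic))
```

With the schedule of Table 1 and bounds of 2 to 7 for every shift type, the check succeeds in the cyclic case. Lowering the maximum for days off to 5 makes the cyclic check fail, because of the six-day stretch that wraps around, while the same sequence without wrap-around still passes.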

A similar constraint is the global sequencing constraint [?]. Defined on a
sequence of variables as well, it is used to specify minimum and maximum numbers
of appearances of each value within every subsequence of a given length. The 
main difference is that these values do not have to appear consecutively (i.e. in 
a stretch). 



3 Its Filtering Algorithm 

This section is devoted to the detailed description of the algorithm enforcing the 
semantics of stretch. 



3.1 Determining Bounds on a Stretch 

All the filterings described here are based on information about the possible
beginning and end of a particular stretch. For any given shift $s_i$ taking value
$\tau_k$, we wish to compute the tightest intervals $[\beta_{\min}, \beta_{\max}]$ and $[\epsilon_{\min}, \epsilon_{\max}]$ in
which the beginning and the end of the stretch through that shift must lie,
respectively, given the current domains of the $s_j$'s. Figure 1 provides an example.
The extremal values of the intervals are derived using the algorithms given below.

$\beta_{\max}$:

1. $j \leftarrow (i - 1) \bmod n$;
2. while $s_j = \tau_k$ do
       $j \leftarrow (j - 1) \bmod n$;
3. $\beta_{\max} \leftarrow (j + 1) \bmod n$;



This first algorithm simply scans the shift variables backwards from index i
until it reaches one that is not currently instantiated to $\tau_k$. $\beta_{\max}$ is then set to
the index following that (see figure 2). We do not reproduce the algorithm for
$\epsilon_{\min}$ here as it is simply the mirror image of the previous one.
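The backward scan for $\beta_{\max}$ can be sketched in Python as follows. Domains are modelled here as a list of Python sets, and a variable counts as instantiated to $\tau_k$ when its domain is exactly `{k}`; both choices, as well as the extra stop condition `j != i`, which only prevents an endless loop when every variable is bound to the same value, are assumptions of this example.

```python
def beta_max(domains, i, k):
    """Latest possible start of the stretch of value k through position i.

    domains: list of sets, one per shift variable, indexed cyclically.
    """
    n = len(domains)
    j = (i - 1) % n
    # Walk left while the neighbour is bound to k (stop after a full turn).
    while j != i and domains[j] == {k}:
        j = (j - 1) % n
    return (j + 1) % n
```

The mirror-image scan to the right would give $\epsilon_{\min}$ in the same way.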



Fig. 1. Bounding a stretch of $\tau_k$'s on a sample fragment of a schedule.

Fig. 2. Determining $\beta_{\max}$ on a sample fragment of a schedule.



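$\beta_{\min}$:

1. if $\overline{\lambda}_k > n$ then too_far $\leftarrow (\epsilon_{\min} + 1) \bmod n$;
2. else too_far $\leftarrow (\epsilon_{\min} - \overline{\lambda}_k) \bmod n$;
3. $j \leftarrow (\beta_{\max} - 1) \bmod n$;
4. done $\leftarrow$ false;
5. while $j \neq$ too_far and not done do
   a. while $\tau_k \in D_{s_j}$ and $|D_{s_j}| > 1$ and $j \neq$ too_far do
      i. $j \leftarrow (j - 1) \bmod n$;
   b. if $\tau_k \notin D_{s_j}$ then
      i. $\beta_{\min} \leftarrow$ find_frontier($j, k, \beta_{\max}$);
      ii. done $\leftarrow$ true;
   c. else if $|D_{s_j}| = 1$ then
      i. $j' \leftarrow j$;
      ii. while $s_j = \tau_k$ and $j \neq$ too_far do
             $j \leftarrow (j - 1) \bmod n$;
      iii. if $s_j = \tau_k$ then
             $\beta_{\min} \leftarrow$ find_frontier($(j' + 1) \bmod n, k, \beta_{\max}$);
             done $\leftarrow$ true;
6. if not done then $\beta_{\min} \leftarrow$ find_frontier($j, k, \beta_{\max}$);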



The computation of $\beta_{\min}$ is more involved. In order to avoid unnecessary work,
a threshold value, too_far, is determined beyond which $\beta_{\min}$ cannot possibly lie.
Step 1 deals with a special case whereas step 2 corresponds to the general case:
the threshold is equal to the earliest position at which the stretch may end minus
the longest it may run. In step 3, j is set to the position immediately preceding
$\beta_{\max}$. Step 5 iterates until either $\beta_{\min}$ is determined or the threshold is reached.
First, the shift variables are scanned backwards from j as long as the current
variable has several values in its domain, including $\tau_k$, and the threshold has
not been reached (5a). If the scan was stopped because $\tau_k$ does not belong to
the domain (5b), then $\beta_{\min}$ cannot lie beyond this point but its proper value is
determined by the call to find_frontier, which we shall describe shortly, and then
we are done (see figure 3). Failing that, if the scan was stopped because the shift
variable is bound (5c), then it must be that $s_j = \tau_k$. If position j is included in
the stretch then so must its immediate predecessor if it is also bound to $\tau_k$, and
so on, since a stretch clearly cannot be flanked by shifts of the very same type,
from its definition. So, the shift variables are scanned backwards until one that is
not currently instantiated to $\tau_k$ is met or the threshold is reached (5cii). If the
variable at the threshold is still instantiated to $\tau_k$ (5ciii), the stretch would be too long
and so $\beta_{\min}$ must lie to the right of that run of bound variables, as determined
by the call to find_frontier, and we are done (see figure 4a). Otherwise a new
iteration of step 5 is begun (see figure 4b). Finally, if the threshold is reached
without encountering any limitation to $\beta_{\min}$, find_frontier is called in order to
determine $\beta_{\min}$ more precisely in light of what lies just beyond that threshold.

Fig. 3. Step 5b in determining $\beta_{\min}$ on a sample fragment of a schedule.

Fig. 4. Step 5c in determining $\beta_{\min}$ on a sample fragment of a schedule.

Just as for $\epsilon_{\min}$, the computation of $\epsilon_{\max}$ is not described since it proceeds
similarly.

find_frontier($j, k, p$):

1. while $j \neq p$
   a. for each $\tau_\ell \in D_{s_j}$ such that $\tau_\ell\tau_k \in \Pi$
      i. found $\leftarrow$ true;
      ii. for $z = 1$ to $\underline{\lambda}_\ell - 1$
             if $\tau_\ell \notin D_{s_{(j-z) \bmod n}}$ then found $\leftarrow$ false;
      iii. if found then return $(j + 1) \bmod n$;
   b. $j \leftarrow (j + 1) \bmod n$;
2. return $-1$;

find_frontier ensures that on the immediate left of $\beta_{\min}$, enough variables
have a common value in their domains to allow a valid neighbouring stretch.
Step 1 scans the variables forward from j until a value $\tau_\ell$ is found that lies in the
domains of $s_j, s_{j-1}, \ldots, s_{j-\underline{\lambda}_\ell+1}$ and that forms a permitted pattern of shift types
with $\tau_k$, at which time value $(j + 1) \bmod n$ is returned. As an example, consider
the fragment in figure 4 with $\underline{\lambda}_p = 2$: $\beta_{\max} - 1$ will be returned. In step 2, if p is
reached without any such value being found then $-1$ is returned. In the context
of the computation of $\beta_{\min}$, the latter means that interval $[\beta_{\min}, \beta_{\max}]$ is empty,
triggering one of the filtering rules in section 3.3.
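The forward scan performed by find_frontier can be written as a short Python function. The sketch below follows the three nested loops of the pseudocode; the representation of domains as a list of sets, of $\Pi$ as a set of ordered pairs `(left_type, right_type)`, and of the minimum lengths as a dictionary `lam_min` are assumptions of this example.

```python
def find_frontier(domains, j, k, p, patterns, lam_min):
    """Return the earliest start, between j + 1 and p, of a stretch of value k.

    A position qualifies when some value t, with (t, k) a permitted pattern,
    lies in the domains of the lam_min[t] variables ending at that position's
    left neighbour. Returns -1 if p is reached without success.
    """
    n = len(domains)
    while j != p:
        for t in sorted(domains[j]):
            if (t, k) not in patterns:
                continue
            # t must also be available lam_min[t] - 1 positions further left.
            if all(t in domains[(j - z) % n] for z in range(1, lam_min[t])):
                return (j + 1) % n
        j = (j + 1) % n
    return -1
```

Sorting the domain only makes the scan deterministic; the result does not depend on the order in which candidate values are tried.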

The following confirms that our later reasoning based on those intervals will 
be sound: 

Theorem 1. Given the current domains of the sequence of variables, intervals
$[\beta_{\min}, \beta_{\max}]$ and $[\epsilon_{\min}, \epsilon_{\max}]$ computed above must respectively include
the starting and the ending indices of the stretch through $s_i$.

Proof. Clearly, the starting index cannot be larger than $\beta_{\max}$ since by construction
of the latter this would mean that the stretch has a shift of the same type
as its immediate left neighbour, contradicting its definition. The argument that
the starting index cannot be smaller than $\beta_{\min}$ has already been given while
describing the corresponding algorithm. Similar arguments can be made for $\epsilon_{\min}$
and $\epsilon_{\max}$, thus completing the proof. □



It is easy to see that all the algorithms given above, with the exception of
the work performed in find_frontier, exhibit a worst-case time complexity that
is linear in $\overline{\lambda}_k$, the maximum length of a stretch of type $\tau_k$. Because of the
three nested loops in find_frontier, its worst-case time complexity is in
$O(\overline{\lambda}_k \cdot m \cdot \max\{\underline{\lambda}_i : 1 \le i \le m\})$. This is still low since in practice the number of shift types
seldom exceeds four and the maximum length of stretches rarely exceeds eight.

Before proceeding further, we give a brief overview of the filtering algorithm. 
Two types of events are considered: when a value is removed from the domain 
of a shift variable, potentially breaking a stretch of the corresponding shift type, 
the possibility of a valid stretch of that type is verified on both sides of the 
variable; when a variable becomes bound, several filterings are applied based on 
where the beginning and end of the stretch through that variable may lie. 



Fig. 5. Detecting a broken stretch (a); a possible fragment of the schedule (b).



3.2 Detecting Broken Stretches

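Each time the domain of a shift variable $s_i$ is modified, the following algorithm
may be applied:

F1.

for each $\tau_\ell$ just removed from $D_{s_i}$
1. if $\tau_\ell \in D_{s_{(i-1) \bmod n}}$
   a. $j \leftarrow$ computation of $\beta_{\min}$ for a potential stretch of $\tau_\ell$'s ending
      at $(i - 1) \bmod n$;
   b. if span($j$, $(i - 1) \bmod n$) $< \underline{\lambda}_\ell$ or $j = -1$ then
      i. remove $\tau_\ell$ from $D_{s_{(i-1) \bmod n}}$;
2. if $\tau_\ell \in D_{s_{(i+1) \bmod n}}$
   a. $j \leftarrow$ computation of $\epsilon_{\max}$ for a potential stretch of $\tau_\ell$'s starting
      at $(i + 1) \bmod n$;
   b. if span($(i + 1) \bmod n$, $j$) $< \underline{\lambda}_\ell$ or $j = -1$ then
      i. remove $\tau_\ell$ from $D_{s_{(i+1) \bmod n}}$;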



The algorithm considers in turn each value $\tau_\ell$ removed from the domain of
$s_i$. If a stretch of $\tau_\ell$'s may appear on the immediate left of $s_i$, the left-hand
side is examined (step 1). Step 1a determines the earliest beginning of such a
stretch. If it cannot be long enough to be valid then value $\tau_\ell$ is removed from the
domain of $s_{(i-1) \bmod n}$. Note that, at this time, we may not remove that value
for further neighbours to the left since it might prevent a stretch ending a little
before. For example, consider figure 5a with $\underline{\lambda}_\ell = 3$ and $\overline{\lambda}_\ell = 5$: value $\tau_\ell$ will
be removed from $D_{s_{(i-1) \bmod n}}$, but $\tau_\ell$ is still possible for $s_{(i-2) \bmod n}$, as shown in
figure 5b. Nevertheless, that single removal in turn may trigger further deletions.
The examination of the right-hand side (step 2) proceeds similarly.

The application of this algorithm guarantees that any value left in the domain
of a shift variable has enough peers in neighbouring shift variables to make up
a minimum length stretch. This property simplifies the algorithm for filtering
rules F5 and F6 in section 3.3.

Since F1 features in the worst case $O(m)$ computations of $\beta_{\min}$, which itself
includes a call to find_frontier, its worst-case time complexity is in
$O(\overline{\lambda}_k \cdot m^2 \cdot \max\{\underline{\lambda}_i : 1 \le i \le m\})$.



3.3 Reasoning on the Potential Extent of a Stretch 

Once we have the intervals, a number of filtering rules may be applied each time
a shift variable is instantiated to a value $\tau_k$. First, a value of $-1$ may have been
returned by a call to find_frontier:

F2. If $\beta_{\min} = -1$ or $\epsilon_{\max} = -1$ then the stretch constraint is violated.

We may also simply discover that the stretch is necessarily too long or too short:

F3. If span($\beta_{\max}$, $\epsilon_{\min}$) $> \overline{\lambda}_k$ then the stretch constraint is violated.

F4. If span($\beta_{\min}$, $\epsilon_{\max}$) $< \underline{\lambda}_k$ then the stretch constraint is violated.
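Rules F2 to F4 amount to three constant-time tests on the interval bounds. A minimal Python sketch, in which the function name and the convention of returning the name of the violated rule (or `None`) are assumptions made for illustration:

```python
def span(i, j, n):
    """Length of the cyclic index range from i to j inclusive."""
    return 1 + (j - i) % n


def check_bounds(beta_min, beta_max, eps_min, eps_max, lam_min_k, lam_max_k, n):
    """Return the name of the first violated rule among F2, F3, F4, or None."""
    if beta_min == -1 or eps_max == -1:
        return "F2"  # find_frontier found no valid neighbouring stretch
    if span(beta_max, eps_min, n) > lam_max_k:
        return "F3"  # even the shortest possible stretch is too long
    if span(beta_min, eps_max, n) < lam_min_k:
        return "F4"  # even the longest possible stretch is too short
    return None
```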



Fig. 6. Illustration of rule F5.

F5. If $\beta_{\max} = \beta_{\min}$ then we know precisely where the stretch begins. Value $\tau_k$
may thus be removed from preceding shift variables:

1. $\Lambda \leftarrow \infty$;
2. for each $\tau_p \in D_{s_{(\beta_{\max} - 1) \bmod n}}$ such that $\tau_p\tau_k \in \Pi$
   a. if $\underline{\lambda}_p < \Lambda$ then $\Lambda \leftarrow \underline{\lambda}_p$;
3. if $\Lambda \neq \infty$ then for $i = 1$ to $\Lambda$
   a. remove $\tau_k$ from $D_{s_{(\beta_{\max} - i) \bmod n}}$;

Step 2 computes the length $\Lambda$ of the shortest feasible neighbouring stretch
on the left so that step 3 may remove value $\tau_k$ from the $\Lambda$ shift variables
immediately preceding the current stretch since those neighbours must necessarily
be of a different type. For example in figure 6 suppose $\underline{\lambda}_p = 3$ and
$\underline{\lambda}_q = 2$. So $\Lambda = 2$ and $\tau_k$ may be removed from the domains of the two shift
variables immediately preceding $\beta_{\max}$.

F6. If $\epsilon_{\max} = \epsilon_{\min}$ then we know precisely where the stretch ends. Value $\tau_k$ may
thus be removed from following shift variables, using an algorithm that is
the symmetric counterpart of the previous one.
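Rule F5 translates almost line for line into Python. In the sketch below the stretch of value `k` is assumed to start exactly at index `beta`; domains are a list of sets that the function prunes in place, $\Pi$ is a set of ordered pairs and `lam_min` a dictionary of minimum lengths. All of these names are illustrative assumptions.

```python
import math


def rule_f5(domains, beta, k, patterns, lam_min):
    """Remove k from the variables just left of a stretch known to start at beta.

    Returns the list of indices whose domain was actually reduced.
    """
    n = len(domains)
    left = domains[(beta - 1) % n]
    # Length of the shortest neighbouring stretch that may precede the k-stretch.
    shortest = min((lam_min[p] for p in left if (p, k) in patterns),
                   default=math.inf)
    if shortest == math.inf:
        return []
    pruned = []
    for i in range(1, shortest + 1):
        idx = (beta - i) % n
        if k in domains[idx]:
            domains[idx].discard(k)
            pruned.append(idx)
    return pruned
```

With minimum lengths 3 and 2 for the two candidate neighbours, as in the example of figure 6, the value is removed from exactly two variables.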




Fig. 7. The slots with diagonal stripes are already fixed. The horizontally striped one 
can also be fixed, from rule F7. 

F7. Given the minimum length of the stretch, it may sometimes be the case
that wherever it lies within the interval $[\beta_{\min}, \epsilon_{\max}]$, a particular position is
always covered by that stretch:

1. for $i = 1$ to $\underline{\lambda}_k -$ span($\beta_{\min}$, $\epsilon_{\min}$)
       $s_{(\epsilon_{\min} + i) \bmod n} \leftarrow \tau_k$;
2. for $i = 1$ to $\underline{\lambda}_k -$ span($\beta_{\max}$, $\epsilon_{\max}$)
       $s_{(\beta_{\max} - i) \bmod n} \leftarrow \tau_k$;

Step 1 fixes the variables beyond $\epsilon_{\min}$ which would be included in the shortest
possible stretch (of length $\underline{\lambda}_k$) starting at the earliest possible position
($\beta_{\min}$), if any, because these variables must be part of any valid stretch (see
figure 7). Step 2 does the same from the other end.
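The two loops of rule F7 can be sketched in Python as a function that returns the indices to be fixed to $\tau_k$. The function name and the decision to return the indices rather than modify domains are assumptions of this example.

```python
def span(i, j, n):
    """Length of the cyclic index range from i to j inclusive."""
    return 1 + (j - i) % n


def rule_f7(beta_min, beta_max, eps_min, eps_max, lam_min_k, n):
    """Indices that every stretch of minimum length lam_min_k must cover."""
    forced = []
    # Shortest stretch starting as early as possible still reaches past eps_min.
    for i in range(1, lam_min_k - span(beta_min, eps_min, n) + 1):
        forced.append((eps_min + i) % n)
    # Shortest stretch ending as late as possible still reaches before beta_max.
    for i in range(1, lam_min_k - span(beta_max, eps_max, n) + 1):
        forced.append((beta_max - i) % n)
    return forced
```

For instance, with a start in [2, 4], an end in [4, 6] and a minimum length of 4, positions 5 and 3 are covered by every admissible stretch and can be fixed.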

The worst-case time complexity of the first three rules is in $O(1)$, that of rules
F5 and F6 is in $O(m + \max\{\underline{\lambda}_i : 1 \le i \le m\})$, and in $O(\max\{\underline{\lambda}_i : 1 \le i \le m\})$ for
F7. This concludes our complexity analysis; note that the overall complexity
of the algorithm is not related to n.



4 Experimental Results 

In this section we propose to evaluate the efficiency of our algorithm both on 
realistic problems and on a larger set of generated benchmarks. 

The stretch constraint as described in this paper was used to model and 
solve several real-life rotating schedule problem instances from the literature 
using an algorithm described in [5] . Though it is far from being the 
only reason for the success of the algorithm since a few other global constraints 
are present and a specially tailored search strategy was devised, its impact on the 
overall efficiency is significant, as shown in table 2. Seven versions of the filtering
algorithm were tested: a complete one, five versions leaving out one particular
filtering component², and a naive version. Each time a shift variable is bound,
the latter passively checks that the current stretch through that variable is not 
too long and that there is sufficient room to reach minimum length. For each 
version of the stretch constraint and each problem instance, the number of 
failures is reported. 



Table 2. Number of failed branches for different versions of the algorithm applied to
cyclic rostering problems from the literature.

version of stretch   alcan   horot   MOT   butler   laporte   hung   lau
complete              1004      16     0     2024         1    100     1
without F1            1491      67     0    11422        19     68     1
without F2            1004      16     0     2024         1     68     1
without F4            1004      16     0     2024         1    100     1
without F5,F6         1150      50     1     4631         1    100    48
without F7            1529      55   161   154685       121    100     1
naive                 3578     176   204        -    457587     68    59



² F3 was not left out because it is necessary for the correctness of the algorithm.

A first observation is that every instance is easily solved by the complete
version, but not so easily by the naive version. In fact, one instance (butler)
could not be solved within one hour of computation. Of the individual filtering
rules tested, F1 and F7 appear to be the most crucial while F2 and F4 have
little effect. In fact, F4 is redundant with respect to F1 or F7.



Table 3. A comparison of the number of failed branches and computation time (in
seconds on a Sun Ultra 5 at 400MHz) between the complete and naive versions over a
range of generated benchmarks.

                                  complete             naive
length   # values   mean gap      fails     time       fails        time
25       3          1              10.8     0.01      4970.6        0.74
                    3               1.8     0.01       206.2        0.05
                    5               1.9     0.02       401.8        0.04
         5          1             135.8     0.13    946310.0      212.52
                    3               4.5     0.02     49323.9        3.33
                    5               4.8     0.02     76988.1        8.92
         7          1              13.3     0.03    386865.0       73.14
                    3              55.7     0.05    1.36x10^6     240.19
                    5              23.0     0.03    179134.0       32.34
50       3          1              31.2     0.04           -           -
                    3               5.7     0.04           -           -
                    5               1.9     0.04    566220.0      105.36
         5          1              55.4     0.04           -           -
                    3             158.6     0.11           -           -
                    5              74.6     0.03           -           -
         7          1             914.7     0.38           -           -
                    3            2532.2     0.77           -           -
                    5           37114.3     1.97           -           -

For a more thorough assessment of the stretch constraint, a set of instances 
of the following benchmark problem were generated: given the length of a circular 
sequence, the set of values that may be used in the sequence, as well as minimum 
and maximum lengths for a stretch of each of those values, find a valid sequence. 
Here every juxtaposition of shift types constitutes a permitted pattern. Granted, 
this is not a difficult problem if we proceed sequentially and with a bit of planning 
but we further impose that the sequence should be filled in random order with 
values randomly selected from the current domain (the pseudo-random number 
generator used the same seed for both versions of the constraint). This not 
only makes the problem harder to solve but also approximates a more realistic 
context in which fragments of the sequence may be preassigned or fixed through 
the intervention of other constraints. 

The first three columns of table 3 describe the parameters of the instances.
The mean gap (column 3) refers to the difference between the minimum and maximum
allowed stretch lengths for a given value. In the following four columns,
each entry corresponds to an average over ten instances. Entries shown as "-"
indicate that the corresponding instances could not be solved within one hour of
computation.

It is difficult to notice a clear trend across the parameter space. With the 
exception of the occasional harder instance, it appears that the difficulty in- 
creases with the number of values allowed and, obviously, with the length of the 
sequence. The effect of the mean gap is unclear. More interestingly, the full ver- 
sion of the stretch constraint performs up to several orders of magnitude better 
than the naive version both in the size of the search tree and the computation 
time, even though more effort is expended during each call to the former. The 
smaller variance in the results of the full version is also noteworthy. 



5 Discussion 

Much of the filtering relies on the $[\beta_{\min}, \beta_{\max}]$ and $[\epsilon_{\min}, \epsilon_{\max}]$ intervals
computed. Unfortunately these intervals are not necessarily the tightest possible, as
the following example shows.


















j 



Fig. 8. An example of an interval that could be tighter. 



The difficulty originates from find_frontier, in which permitted patterns are
taken into account. Consider the situation depicted in figure 8 with $\underline{\lambda}_p = 2$,
Xg = 1, and n = { ThTr, TrTp, T^T,, TpTfe, T^Tfe, TfcTs, TfcT/j}. After fixing the 
rightmost shift of this fragment to Tk, we wish to determine the corresponding 
$\beta_{\min}$. Eventually we reach step 5b and call find_frontier with j as indicated in the
figure. Since a stretch of $\tau_q$'s may be as short as a single shift, $j + 1$ is returned as
the value of $\beta_{\min}$. However, closer inspection reveals that, several shifts back, a
shift is fixed to and since neither t^Ts nor r^Tg belong to 7T, Ts cannot occur at 
j — 1. This in turn means Tg cannot occur at j (because TrTg ^ U), which leaves 
$\tau_p$. Since a stretch of $\tau_p$'s must include at least two shifts, $\beta_{\min}$ should rather
be $j + 2$. Ultimately, this difference could translate into an earlier detection of a
violated constraint (for example, through rule F4 in section 3.3).

Therefore a higher level of consistency could be achieved by examining a
larger fraction of the sequence, potentially all of it, but at a higher computational
cost as well since the complexity of the algorithm would then be related to n. A
more efficient alternative would be to use the stretch constraint as described
but in conjunction with a constraint for permitted patterns equipped with an
appropriate filtering algorithm to prune the domains and thus avoid the situation
depicted in figure 8. However, this lies beyond the scope of the present paper.



6 Conclusion

This paper presented a new global constraint on a sequence of variables. It may 
be useful whenever limits are given on the number of consecutive identical values 
in the sequence. One immediate domain of application is rostering and several 
supporting experiments in that area were reported. The filtering algorithm used 
by the constraint was shown to have low complexity and significant pruning 
capability. 

This constraint was successfully used in the multi-shift scheduling system 
described in [5] to model several of the constraints sometimes found in rostering
problems: constraints on the length of work stretches of a given type or of mixed 
types, constraints on the length of stretches of days off, constraints on the number 
of consecutive weekends off, etc. It was also instrumental in constraining the 
number and spacing of stretches of each length, through a simple extension.



Abstract. We introduce a new global constraint for modeling and solving
network flow problems in constraint programming. We describe the
declarative and operational semantics of the flow constraint and illustrate
its use through a number of applications.



1 Introduction 

Network flows are a fundamental concept in mathematics and computer sci- 
ence. They play an important role in various applications, e.g. in transportation, 
telecommunication, or supply chain optimization m- Many classical network 
models can be solved very quickly, they have naturally integer solutions, and they 
provide a modeling language for real world problems that is easier to understand 
than, e.g., the language of linear programming. 

In spite of their importance, constraint programming systems normally do 
not provide special support to deal with network flows. We introduce here a new 
global constraint flow for modeling and solving network problems inside con- 
straint programming. The flow constraint is complementary to existing global 
constraints. Typically, it is used together with other global constraints and all 
kinds of side constraints. While pure network flow problems may be solved directly
by specialized algorithms m, our goal here is to handle efflciently problems in 
constraint programming that involve network flows as a subproblem. 

Global constraints are a key concept of constraint programming. They were 
first introduced in the Chip system m- Since that time, they have been con- 
tinuously studied in the literature. Recent work on global constraints includes, 
e.g., [II 2)1 6117] . A classification scheme for global constraints is presented in [^. 
The role of global constraints for the integration of constraint programming and 
mathematical programming is discussed, among others, in [7, 9, 15, 13, 14].

There are two main benefits of global constraints. On the one hand, they
provide high-level abstractions for modeling complex combinatorial problems in
a natural and declarative way. They serve as building blocks for developing large
applications. On the other hand, they make available efficient algorithms for solving
specific combinatorial problems within a general-purpose solver. Typically,
global constraints give much stronger propagation than equivalent formulations
based on elementary constraints, provided such formulations exist at all.

* This work was partially supported by the European Commission, Growth Programme,
Research Project LISCOS - Large Scale Integrated Supply Chain Optimisation
Software, Contract No. G1RD-CT-1999-00034

T. Walsh (Ed.): CP 2001, LNCS 2239, 2001.

(c) Springer-Verlag Berlin Heidelberg 2001

The organization of this paper is as follows. We start in Sect. 2 with the 
declarative semantics of the new constraint. First we describe the underlying 
mathematical model, then we introduce the flow constraint in two different
forms. A key feature of this constraint is the use of conversion nodes. They are
particularly useful when modeling supply chain optimization problems. Sect. 3 discusses
the operational semantics. We present a decomposition technique for generalized 
networks with conversion nodes and expose the main ideas used in propagation. 
Sect. 4 contains three applications of the flow constraint: maximum flow, pro- 
duction planning, and personnel scheduling. Finally, Sect. 5 briefly describes the 
current implementation of the flow constraint within the Chip system. 



2 A Global Constraint for Flow Problems 

In this section, we introduce the new global constraint flow. We start by de- 
scribing the underlying mathematical model. 



2.1 Generalized Flow Networks 

A generalized flow network $\mathcal{N} = (V = V^s \cup V^d \cup V^c, E; l, u, c; \gamma, d^-, d^+, q)$ is a
directed network of n nodes and m arcs, where

— $V$ is the set of nodes, which is partitioned into three subsets $V^s$, $V^d$, and $V^c$
of supply, demand, and conversion nodes respectively;

— $E$ is the set of directed arcs;

— $l, u : E \to \mathbb{R}_+$ are lower and upper capacity functions;

— $c : E \to \mathbb{R}$ is an edge cost function;

— $\gamma : E(V, V^c) \to \mathbb{R}$ is a conversion function;

— $q : V \to \mathbb{R}$ is a node cost function;

— $d^-, d^+ : V \to \mathbb{R}_+$ are lower and upper demand functions.

Here $E(X, Y)$, for $X, Y \subseteq V$, denotes the set $\{(v,w) \in E : v \in X, w \in Y\}$ of
arcs leaving $X$ and entering $Y$. An arc $(v,w) \in E(V, V^c)$ is called a conversion
arc.

A pseudoflow in $\mathcal{N}$ is a function $f : E \to \mathbb{R}$ that satisfies the capacity
constraints

$l(v,w) \le f(v,w) \le u(v,w)$, $(v,w) \in E$.   (1)




198 A. Bockmayr, N. Pisaruk, and A. Aggoun 



For a pseudoflow $f$, the inflow, outflow, and excess at node $v \in V$ are defined by

$in_f(v) := \sum_{(w,v) \in E(V,v)} f(w,v)$,

$out_f(v) := \sum_{(v,w) \in E(v,V)} f(v,w)$,

$exc_f(v) := in_f(v) - out_f(v)$.

A circulation is a pseudoflow $f$ in $\mathcal{N}$ with $exc_f(v) = 0$, for all $v \in V$.

A pseudoflow $f$ is a flow if it satisfies the balance constraints

$d^-(v) \le -exc_f(v) \le d^+(v)$, $v \in V^s$,
$d^-(v) \le exc_f(v) \le d^+(v)$, $v \in V^d$,   (2)
$d^-(w) \le out_f(w) \le d^+(w)$, $w \in V^c$,

and the flow conversion constraints

$f(v,w) = \gamma(v,w) \cdot out_f(w)$, $(v,w) \in E(V,w)$, $w \in V^c$.   (3)

Demand nodes have non-negative excess, supply nodes have non-positive excess
(i.e., a deficit). Conversion nodes are a key feature of the generalized flow
networks introduced in this paper. They are particularly useful when modeling
production processes, see Sect. 4.2. The flow conversion constraints allow one to
state, e.g., that in order to produce 1 unit of product $P$, we need 1 unit of raw
material $R_1$ and two units of raw material $R_2$.

The cost of a pseudoflow $f$ is the value

$c(f) := \sum_{(v,w) \in E} c(v,w) f(v,w) + \sum_{v \in V^s \cup V^d} q(v)\,|exc_f(v)| + \sum_{v \in V^c} q(v)\,out_f(v)$.

Given a network $\mathcal{N}$, the goal is usually to find a flow of minimum cost, i.e., to
solve a minimum cost flow problem.
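The definitions of this section can be exercised with a small Python checker that tests the capacity constraints (1), the balance constraints (2) and the conversion constraints (3) for a given assignment of flow values. The data layout (dictionaries keyed by node name, arcs identified by their position in a list) and the function name are assumptions made for this sketch; it verifies a candidate flow and does not search for one.

```python
def is_flow(node_type, arcs, cap, dem, conv, f):
    """Check constraints (1)-(3) for flow values f on a generalized network.

    node_type[v] is 'supply', 'demand' or 'conv'; arcs is a list of (v, w);
    cap[a] = (l, u) and f[a] refer to arc number a; dem[v] = (d_minus, d_plus);
    conv[a] is the conversion factor of an arc entering a conversion node.
    """
    inflow = {v: 0 for v in node_type}
    outflow = {v: 0 for v in node_type}
    for a, (v, w) in enumerate(arcs):
        lower, upper = cap[a]
        if not (lower <= f[a] <= upper):        # capacity constraints (1)
            return False
        outflow[v] += f[a]
        inflow[w] += f[a]
    for v, kind in node_type.items():           # balance constraints (2)
        exc = inflow[v] - outflow[v]
        value = {"supply": -exc, "demand": exc, "conv": outflow[v]}[kind]
        d_minus, d_plus = dem[v]
        if not (d_minus <= value <= d_plus):
            return False
    for a, (v, w) in enumerate(arcs):           # conversion constraints (3)
        if node_type[w] == "conv" and f[a] != conv[a] * outflow[w]:
            return False
    return True
```

The production example from the text, where one unit of $P$ requires one unit of $R_1$ and two units of $R_2$, corresponds to conversion factors 1 and 2 on the two arcs entering $P$: shipping 3 units of $P$ then requires flows of 3 and 6 on those arcs.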

2.2 The Flow Constraint 

To handle flow problems on generalized networks within constraint program- 
ming, we introduce a global constraint flow of the following form: 

flow(NodeType, Edge, Conv, EdgeCost, NodeCost, Demand, Flow, FlowVal),   (4)

where

— NodeType: a list $[s_1, \ldots, s_n]$ of values from the set {supply, demand, conv};
$s_i$ specifies the type of node i, respectively, supply, demand, or conversion;

— Edge: a list of lists $[[t_1, h_1], \ldots, [t_m, h_m]]$ of values $t_i, h_i$ from the set $V =
\{1, \ldots, n\}$; $t_i$, $h_i$ are the tail and head of arc i;

— Conv: a list $[\gamma_1, \ldots, \gamma_m]$ in which each $\gamma_i$ is either a rational value or undefined;
if defined, $\gamma_i$ is the conversion factor of the conversion arc i;

— EdgeCost: a list $[c_1, \ldots, c_m]$ of rational values $c_i$; $c_i$ is the unit flow cost
along arc i;

— NodeCost: a list $[q_1, \ldots, q_n]$ of rational values $q_i$; $q_i$ is the unit cost at node i;

— Demand: a list $[d_1, \ldots, d_n]$ of variables $d_i$; $d_i$ is the demand at node i and
takes values from an interval $[d_i^-, d_i^+] \subset \mathbb{R}_+$;

— Flow: a list $[f_1, \ldots, f_m]$ of variables $f_i$; $f_i$ is the flow along arc i and takes
values from an interval $[l_i, u_i] \subset \mathbb{R}_+$;

— FlowVal: a domain variable or a rational value.

In the context of finite domain constraint programming, we assume that all 
variables are defined over a finite domain of integer numbers. Note, however, 
that the algorithms described in this paper, can easily be extended to variables 
ranging over an interval of rational numbers. This is important when using the 
flow constraint within a hybrid CP/MIP solver. If the list EdgeCost (resp. 
NodeCost) is empty, the edge (resp. node) costs are assumed to be zero. 

A flow constraint is satisfiable if, in the network $\mathcal{N}$ that is defined by its
arguments, there exists a flow whose cost value is FlowVal.

For large networks, it may be preferable to define the flow constraint in the 
following equivalent form: 

flow([Node_1, ..., Node_n], FlowVal),   (5)

where, for $i = 1, \ldots, n$, Node_i is a list of the form

$[s_i, d_i, [[v_{i,1}, l_{i,1}, u_{i,1}, c_{i,1}, f_{i,1}], \ldots, [v_{i,k(i)}, l_{i,k(i)}, u_{i,k(i)}, c_{i,k(i)}, f_{i,k(i)}]]]$

and

— $s_i$ is a value from the set {supply, demand, conv_with, conv_without}, indicating
whether node i is a supply node, a demand node, a conversion node with
excess, or a conversion node without excess, respectively;

— $d_i$ is a variable, the demand at node i; it takes values from an interval
$[d_i^-, d_i^+] \subset \mathbb{R}_+$;

— for $j = 1, \ldots, k(i)$,

• $v_{i,j}$ is a value from $\{1, \ldots, n\}$;
$(v_{i,1}, i), \ldots, (v_{i,k(i)}, i)$ are the arcs entering node i;

• $c_{i,j}$ is a rational number, the cost of arc $(v_{i,j}, i)$;

• $f_{i,j}$ is a variable, the flow along arc $(v_{i,j}, i)$;
it takes values from an interval $[l_{i,j}, u_{i,j}] \subset \mathbb{R}_+$.

The interest of this alternative form of the flow constraint is that it can be
constructed locally, i.e. by assembling separately data about the arcs entering
each particular node v.

3 Operational Semantics 

In this section, we present the operational semantics of the flow constraint. A
key question is how to handle conversion nodes. First we show how a generalized
flow network $\mathcal{N}$ with conversion nodes can be decomposed into smaller networks
$\mathcal{N}_1, \ldots, \mathcal{N}_k$ such that circulations in $\mathcal{N}_i$ yield flows in $\mathcal{N}$ and vice versa.

3.1 Decomposition into Subnetworks

Let $\mathcal{N} = (V = V^s \cup V^d \cup V^c, E; l, u, c; \gamma, d^-, d^+, q)$ be a generalized flow
network as defined in Sect. 2.1. Let $G = (V, E)$ denote the graph of the network
$\mathcal{N}$. Let $G_i = (V_i, E_i)$, $i = 1, \ldots, k$, be the weak components of the subgraph
$G' = (V, E(V, V \setminus V^c))$. For $i = 1, \ldots, k$, we build the flow network
$\mathcal{N}_i = (\bar{V}_i, \bar{E}_i; l^i, u^i, c^i)$ as follows. First we add a new node $s_i$, not previously in
$V$, and set $\bar{V}_i := V_i \cup \{s_i\}$. Next, we extend the set of arcs $E_i$ with two new
families (see the example at the end of this section):

$\bar{E}_i := E_i \cup D_i \cup H_i$, where

$D_i := \{(s_i, v) : v \in V_i \cap (V^s \cup V^c)\} \cup \{(v, s_i) : v \in V_i \cap V^d\}$, and

$H_i := \{(v, s_i)^w : (v,w) \in E(V_i, V^c)\}$.

For each supply and conversion node $v \in V_i$ we include in $D_i$ the arc $(s_i, v)$, and
for each demand node $v \in V_i$ the arc $(v, s_i)$. Arcs $(v,w)$ in the original network
$\mathcal{N}$ that lead from a node $v \in V_i$ to a conversion node $w$ are represented in $G_i$ by
an arc $(v, s_i)^w \in H_i$ that is labeled with the superscript "w". The cost function
$c^i$ and the capacity functions $l^i, u^i$ on $\bar{E}_i$ are defined as follows:

$c^i(v,w) := c(v,w)$, $l^i(v,w) := l(v,w)$, $u^i(v,w) := u(v,w)$, $(v,w) \in E_i$,

$c^i(s_i,v) := q(v)$, $l^i(s_i,v) := d^-(v)$, $u^i(s_i,v) := d^+(v)$, $(s_i,v) \in D_i$,

$c^i(v,s_i) := q(v)$, $l^i(v,s_i) := d^-(v)$, $u^i(v,s_i) := d^+(v)$, $(v,s_i) \in D_i$,

$c^i((v,s_i)^w) := c(v,w)$, $l^i((v,s_i)^w) := l(v,w)$, $u^i((v,s_i)^w) := u(v,w)$, $(v,s_i)^w \in H_i$.

For $i = 1, \ldots, k$, let $f^i$ be a circulation in $\mathcal{N}_i$; if the collection $(f^1, \ldots, f^k)$
satisfies the constraints

$f^i((v,s_i)^w) = \gamma(v,w)\, f^j(s_j, w)$, $(v,w) \in E(V_i, V_j \cap V^c)$,   (6)

then it determines in the network $\mathcal{N}$ a flow $f$ which is defined by

$f(v,w) = f^i(v,w)$, if $(v,w) \in E(V_i, V_i)$,
$f(v,w) = f^i((v,s_i)^w)$, if $(v,w) \in E(V_i, V \setminus V_i)$.   (7)

Furthermore, the cost of the flow $f$ is equal to the sum of the costs of the
circulations $f^1, \ldots, f^k$, i.e.,

$c(f) = \sum_{i=1}^{k} c^i(f^i)$,   (8)

where $c^i(f^i) := \sum_{(v,w) \in \bar{E}_i} c^i(v,w) f^i(v,w)$.

Conversely, a flow $f$ in $\mathcal{N}$ uniquely determines the collection of circulations
$(f^1, \ldots, f^k)$ defined by (7) and

$f^i(s_i, v) = -exc_f(v)$, $v \in V_i \cap V^s$,

$f^i(s_i, v) = out_f(v)$, $v \in V_i \cap V^c$,

$f^i(v, s_i) = exc_f(v)$, $v \in V_i \cap V^d$.

Example. Let us consider the generalized flow network $\mathcal{N}$ depicted in Fig. 1.
The triples inside the nodes and near the arcs represent:

— $(q(v); d^-(v), d^+(v))$, for a node $v$;

— $(c(v,w); l(v,w), u(v,w))$, for a non-conversion arc $(v,w)$;

— $(c(v,w); \alpha(v,w)/\beta(v,w))$, for a conversion arc $(v,w)$, where $\gamma(v,w) = \alpha(v,w)/\beta(v,w)$.

Suppose that the conversion arcs have lower capacity 0 and upper capacity 10.




Fig. 1. Network $\mathcal{N}$



The decomposition of $\mathcal{N}$ into 3 subnetworks is presented in Fig. 2. The flow
in network $\mathcal{N}$, represented by the italic numbers in Fig. 1, corresponds to the
circulations depicted in Fig. 2.

3.2 Propagation 

In the previous section we have shown that, to find a flow $f$ in a generalized flow network $\mathcal{N}$, we can decompose $\mathcal{N}$ into smaller subnetworks $\mathcal{N}_1, \ldots, \mathcal{N}_k$ and then look for a collection of circulations $(f^1, \ldots, f^k)$ obeying the linear constraints (6). It remains to discuss propagation on these subnetworks. This is based on classical network algorithms. Due to lack of space, we can describe here only the main ideas.

For $i = 1, \ldots, k$, we verify whether there exists a circulation in every network $\mathcal{N}_i$; if, for some $i$, the answer is negative, then there is no flow in $\mathcal{N}$ and we are done. Otherwise, for each network $\mathcal{N}_i$, we apply two propagation subroutines to reduce the feasible intervals $[l(v,w), u(v,w)]$ for the flow variables $f(v,w)$. These subroutines are called recursively and in cooperation with a propagation procedure for the linear constraints.

Fig. 2. Decomposition of network $\mathcal{N}$



Suppose that we are given a circulation network, i.e. a flow network of the form $C\mathcal{N} = (V, E; l, u, c)$, without $\gamma, d^+, d^-, q$. By Hoffman's theorem [TU], there is a circulation in $C\mathcal{N}$ iff

$$ \sum_{(v,w) \in E(X, V \setminus X)} l(v,w) \;\le\; \sum_{(v,w) \in E(V \setminus X, X)} u(v,w), \quad \text{for all } X \subseteq V. \tag{9} $$

By a single maximum flow computation (see m), we can find either a circulation in $C\mathcal{N}$ or a subset $X$ for which inequality (9) is most violated.
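
Hoffman's condition can be illustrated on a toy instance. The following Python sketch is not the maximum flow computation mentioned above: it simply enumerates all proper subsets $X$, which is only practical for very small networks. The function name `most_violated_set` and the two three-node test networks are hypothetical and chosen for illustration.

```python
from itertools import combinations

def most_violated_set(nodes, arcs):
    """Brute-force check of Hoffman's condition.

    arcs maps (v, w) to (lower, upper). Returns a node set X that maximises
    sum of lower bounds leaving X minus sum of upper bounds entering X,
    or None when no set violates the condition (a circulation exists).
    """
    worst, worst_set = 0, None
    for size in range(1, len(nodes)):
        for subset in combinations(nodes, size):
            inside = set(subset)
            lower_out = sum(lo for (v, w), (lo, up) in arcs.items()
                            if v in inside and w not in inside)
            upper_in = sum(up for (v, w), (lo, up) in arcs.items()
                           if v not in inside and w in inside)
            if lower_out - upper_in > worst:
                worst, worst_set = lower_out - upper_in, inside
    return worst_set

# Hypothetical 3-node cycles: the second forces at least 3 units on (0,1)
# but allows at most 2 units on (1,2).
feasible = {(0, 1): (1, 2), (1, 2): (1, 2), (2, 0): (1, 2)}
infeasible = {(0, 1): (3, 4), (1, 2): (0, 2), (2, 0): (0, 5)}
print(most_violated_set([0, 1, 2], feasible))    # None
print(most_violated_set([0, 1, 2], infeasible))  # {0, 2}
```

The enumeration is exponential in the number of nodes, so it serves only to make inequality (9) concrete.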

Our first propagation subroutine is based only on feasibility reasoning. For an arc $(v,w) \in E$, let $\alpha(v,w)$ (resp. $\beta(v,w)$) denote the maximum flow value from $v$ to $w$ (resp. from $w$ to $v$) in the network $(V, E \setminus \{(v,w)\}; l, u)$. The recursive step of our first propagation subroutine calculates (or only estimates) for $(v,w) \in E$ the values $\alpha(v,w)$, $\beta(v,w)$, and then replaces $l(v,w)$ by $\max\{l(v,w), \alpha(v,w)\}$, and $u(v,w)$ by $\min\{u(v,w), \beta(v,w)\}$.

Our second subroutine is based on optimality reasoning. Let $B$ be an upper bound on the cost of a minimal circulation. The subroutine first computes an optimal circulation $f$ and an optimal price function $p : V \to \mathbb{R}$, i.e., such that the complementary slackness condition holds

$$ c_p(v,w) < 0 \Rightarrow f(v,w) = u(v,w), \qquad c_p(v,w) > 0 \Rightarrow f(v,w) = l(v,w), \tag{10} $$

where $c_p(v,w) = p(v) + c(v,w) - p(w)$ denotes the reduced cost of an arc $(v,w)$ with respect to a price function $p$ (see again E). Then it changes the lower and upper capacities according to the rule:

— if $c_p(v,w) < 0$ and $\epsilon = \dfrac{B - c(f)}{-c_p(v,w)} < u(v,w) - l(v,w)$, then set $l(v,w) = u(v,w) - \epsilon$;

— if $c_p(v,w) > 0$ and $\delta = \dfrac{B - c(f)}{c_p(v,w)} < u(v,w) - l(v,w)$, then set $u(v,w) = l(v,w) + \delta$.
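
The reduced-cost rule can be sketched for a single arc as follows. This is a minimal Python illustration, not the Chip/C implementation: the function name `tighten_by_reduced_cost` and the argument `gap` (standing for $B - c(f)$) are assumptions, and the numbers in the usage lines are made up.

```python
def tighten_by_reduced_cost(lower, upper, reduced_cost, gap):
    """Tighten the capacity interval of one arc.

    gap is B - c(f): the slack between the cost bound B and the cost of the
    optimal circulation f. Moving the arc flow away from the bound where it
    sits in f by x units costs at least |reduced_cost| * x, so x is limited
    to gap / |reduced_cost|.
    """
    if reduced_cost < 0:            # f(v,w) = u(v,w) in the optimal circulation
        eps = gap / -reduced_cost
        if eps < upper - lower:
            lower = upper - eps
    elif reduced_cost > 0:          # f(v,w) = l(v,w) in the optimal circulation
        delta = gap / reduced_cost
        if delta < upper - lower:
            upper = lower + delta
    return lower, upper

print(tighten_by_reduced_cost(0, 10, -2, 6))   # (7.0, 10)
print(tighten_by_reduced_cost(0, 10, 3, 6))    # (0, 2.0)
print(tighten_by_reduced_cost(0, 10, -1, 50))  # (0, 10): slack too large to prune
```

With integer flow variables the new bounds could additionally be rounded inwards; the sketch leaves them as real numbers.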
4 Some Applications

We present in this section a number of applications of the flow constraint. The list of examples given here is by no means exhaustive; we can illustrate only some basic features of the flow constraint. More advanced applications would include, e.g., cyclic time tabling, multicommodity flows, network design, and
flow problems with various side constraints, like the equal flow problem [^. 

4.1 Maximum Flow 

It is quite natural that the flow constraint can be used for solving most of the 
classical network problems. Here, we demonstrate this for the maximum flow 
problem. 

Consider a network $(V, E; u, l, s, t)$, with $n = |V|$ nodes and $m = |E|$ arcs. $l, u : E \to \mathbb{R}$ are lower and upper capacity functions, $s \in V$ is a source, and $t \in V$ is a sink. The maximum flow problem consists in finding a flow $f$ in the network such that $exc_f(v) = 0$ for all $v \in V \setminus \{s,t\}$ and $exc_f(t) = -exc_f(s)$, the value of the flow $f$, is maximal.

As an example let us consider the instance of the maximum flow problem represented in Fig. 3. Here, the numbers in the parentheses are the arc capacities, the other numbers represent a maximum flow of value 15. This flow is obtained by the following simple solution strategy (see also Fig. 4):

— set up the list representation of the network; 

— post one flow constraint and do propagation; 

— fix the value of Demand[source] to its upper bound (this starts propagation 
again); 

— in turn, fix each flow variable to any value from its current domain (followed 
each time by propagation). 




Fig. 3. Maximum flow: network 
n = 8; m = 16; s = 0; t = 7;
NodeType = [supply, supply, supply, supply, supply, supply, demand];
Edge = [[0,1], [0,2], [0,3], [1,2], [1,3], [1,4], [2,3], [2,5], [2,6],
        [3,4], [3,5], [4,5], [4,7], [5,7], [6,5], [6,7]];
LoCap = [2,1,0,1,2,1,0,0,2,1,1,2,1,0,1,0];
UpCap = [8,5,3,6,4,5,5,3,5,5,4,5,3,8,3,9];
Demand[v] = 0, v ≠ s, t;
Demand[s] ∈ [0,16];
Demand[t] ∈ [0,16];
Flow[e] ∈ [LoCap[e], UpCap[e]], e = 0, ..., m-1;
flow(NodeType, Edge, [], [], [], Demand, Flow, 0).

Demand[s] ← max_val_in_domain(Demand[s]);
for (e = 0, ..., m-1)
    Flow[e] ← min_val_in_domain(Flow[e]);

Fig. 4. Maximum flow: model and solution procedure



4.2 Production Planning 

Suppose there are two types of manufacturing facilities F1, F2 for producing a discrete product P. In both facilities, two raw materials R1 and R2 are used. Up to 400 units of R1 and up to 700 units of R2 are available. One unit of R1 costs 5$, and one unit of R2 costs 7$. Because of different technologies, the quantities of the raw materials used for producing one unit of product P are different in F1 and F2, see Tab. 1.



Table 1. Production planning: data

        R1    R2                 S1    S2    S3
P1      1     2          P1      1     1     2
P2      1     3/2        P2      2     1     1

The production cost of one unit of product P is 12$ in facility F1, and 10$ in facility F2. The maximum capacities of facilities F1 and F2 are, respectively, 200 and 250 units of the product. Furthermore, at least 100 resp. 150 units of the product must be produced in the facilities F1 resp. F2. The demands for product P at the customer sites S1, S2, and S3 are 160, 70, and 140 units, respectively. The unit transportation costs for shipping units of products from facilities to customers can also be found in Tab. 1. The problem is to determine the production rates and the shipping patterns to meet all the demands at a minimum cost.

We formulate this problem as a generalized flow problem in the network given in Fig. 5. Nodes 0 and 1, respectively, represent the raw materials R1 and R2; nodes 2 and 3 are production facilities for product P1 and P2 respectively; nodes 4, 5, and 6 are customer nodes that represent the sites S1, S2, and S3. The numbers in parentheses at the nodes are the lower and upper demands. The numbers inside the circles are the costs, and the numbers inside the rectangles are the conversion factors. We define the upper capacity of an arc as the upper demand of its head node. All lower capacities are zero.



Fig. 5. Production planning: network



A complete model for this problem is given in Fig. 6. Labeling the demand variables and fixing the flow variables yields, after running the corresponding C implementation, an optimal flow f(0,2) = 120, f(0,3) = 250, f(1,2) = 240, f(1,3) = 375, f(2,4) = 120, f(2,5) = f(2,6) = 0, f(3,4) = 40, f(3,5) = 70, f(3,6) = 140, whose cost is 13095.
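
The reported solution can be re-checked by hand or with a few lines of Python. The sketch below uses the arrays of the model in Fig. 6 and the flow values quoted above. Two things are assumptions of this sketch rather than statements of the paper: the conversion factors are read as units of raw material consumed per unit of product, and the node cost is charged per unit of flow leaving a node.

```python
from fractions import Fraction

edges = [(0, 2), (0, 3), (1, 2), (1, 3), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)]
edge_cost = [3, 4, 2, 2, 1, 1, 2, 2, 1, 1]
node_cost = [5, 7, 12, 10, 0, 0, 0]
# Assumed reading of the conversion factors: raw material per unit of product.
conv = {(0, 2): Fraction(1), (0, 3): Fraction(1),
        (1, 2): Fraction(2), (1, 3): Fraction(3, 2)}
flow = {(0, 2): 120, (0, 3): 250, (1, 2): 240, (1, 3): 375, (2, 4): 120,
        (2, 5): 0, (2, 6): 0, (3, 4): 40, (3, 5): 70, (3, 6): 140}

def outflow(v):
    return sum(flow[e] for e in edges if e[0] == v)

def inflow(v):
    return sum(flow[e] for e in edges if e[1] == v)

# Each facility consumes raw material in proportion to what it ships.
conversion_ok = all(flow[e] == factor * outflow(e[1]) for e, factor in conv.items())
# Customer demands 160, 70 and 140 are met exactly.
demand_ok = [inflow(v) for v in (4, 5, 6)] == [160, 70, 140]

total_cost = (sum(c * flow[e] for e, c in zip(edges, edge_cost))
              + sum(node_cost[v] * outflow(v) for v in range(7)))
print(conversion_ok, demand_ok, total_cost)  # True True 13095
```

Under these assumptions the arc costs contribute 3000 and the node costs 10095, which together give the stated value 13095.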



4.3 Personnel Scheduling 

The telephone service of an airline operates around the clock. Tab. 2 indicates for 6 time periods of 4 hours the number of operators needed to answer the incoming calls.



Table 2. Personnel scheduling: data

Period   Time of day          Min. operators needed
0        3 a.m. to 7 a.m.     26
1        7 a.m. to 11 a.m.    52
2        11 a.m. to 3 p.m.    86
3        3 p.m. to 7 p.m.     120
4        7 p.m. to 11 p.m.    75
5        11 p.m. to 3 a.m.    35
n = 7; m = 10;
NodeType = [supply, supply, conv, conv, demand, demand, demand];
Edge = [[0,2], [0,3], [1,2], [1,3], [2,4], [2,5], [2,6], [3,4], [3,5], [3,6]];
Conv = [1,1,2,3/2,-,
EdgeCost = [3,4,2,2,1,1,2,2,1,1];
NodeCost = [5,7,12,10,0,0,0];
LoDem = [0,0,100,150,160,70,140];
UpDem = [400,700,200,250,160,70,140];
UpCap = [200,250,200,250,160,70,140,160,70,140];
Demand[v] ∈ [LoDem[v], UpDem[v]], v = 0, ..., n-1;
FlowVal ∈ [0,100000];
Flow[e] ∈ [0, UpCap[e]], e = 0, ..., m-1;
flow(NodeType, Edge, Conv, EdgeCost, NodeCost, Demand, Flow, FlowVal).
if (labeling(Demand)) {
    FlowVal = min_val_in_domain(FlowVal);
    for (e = 0, ..., m-1)
        Flow[e] ← min_val_in_domain(Flow[e]); }

Fig. 6. Production planning: model and solution procedure



We assume that operators work for a consecutive period of 8 hours. They can start to work at the beginning of any of the 6 periods. Let $x_t$ denote the number of operators starting to work at the beginning of period $t$, $t = 0, \ldots, 5$. We need to find the optimum values for $x_t$ to meet the requirements in all the periods, by employing the least number of operators.

Any feasible schedule $x = (x_0, x_1, x_2, x_3, x_4, x_5)$ that meets the requirements on the operators in the different time periods can be represented by a circulation $f$ in the network depicted in Fig. 7.



Fig. 7. Personnel scheduling: network



In this network, every node $t$ corresponds to the beginning of period $t$, $t = 0, \ldots, 5$. There are two types of arcs: working arcs $(t, (t+1) \bmod 6)$ and free arcs $(t, (t+4) \bmod 6)$. A flow $f(t, (t+1) \bmod 6) = x_t + x_{(t+5) \bmod 6}$ along a working arc $(t, (t+1) \bmod 6)$ corresponds to the number of operators scheduled to work during period $t$; therefore, the lower capacity of this arc (number given in parentheses) is defined to be the number of operators needed during that period. A flow $f(t, (t+4) \bmod 6) = x_{(t+4) \bmod 6}$ along a free arc $(t, (t+4) \bmod 6)$ corresponds to the number of operators having free time during periods $t, t+1, t+2, t+3$; its lower capacity is zero. It can be easily checked that we can set the upper capacity of each arc to 120 (the maximal number of operators needed for one period).

The circulation represented by the numbers on the arcs in Fig. 7 yields a feasible schedule $x = (11, 41, 45, 75, 0, 35)$. In fact, this is even an optimal schedule. However, an arbitrary circulation does not always determine a feasible schedule. In general, it may violate the requirement that each operator works for a consecutive period of 8 hours. In other words, this means that the number of operators working during some period must be equal to the number of operators starting to work at the beginning of this period plus the number of operators finishing their work at the end of this period. To meet this condition, a schedule-circulation $f$ must comply for $t = 0, \ldots, 5$ with the side constraints

$$ f(t, (t+1) \bmod 6) = f((t-4) \bmod 6, t) + f((t+1) \bmod 6, (t+5) \bmod 6). \tag{11} $$

We define arc costs $c(t, (t+1) \bmod 6) = 1$ and $c(t, (t+4) \bmod 6) = 0$, for $t = 0, \ldots, 5$. Since each operator works during two consecutive periods, the cost $c(f)$ of a circulation $f$ is equal to twice the number of operators employed. If $f$ is an optimal schedule-circulation, then the optimal values for $x_t$ are defined by $x_t = f((t-4) \bmod 6, t)$, for $t = 0, \ldots, 5$.
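
The correspondence between a schedule and a schedule-circulation can be verified numerically. The Python sketch below builds the circulation from the schedule $x = (11, 41, 45, 75, 0, 35)$, checks flow conservation, the coverage requirements of Tab. 2 and the side constraints (11), and compares the number of operators with a simple lower bound. The lower-bound argument (every operator covers exactly one even and one odd period) is added here for illustration and is not taken from the paper; all variable names are hypothetical.

```python
needed = [26, 52, 86, 120, 75, 35]
x = [11, 41, 45, 75, 0, 35]
n = len(needed)

# Working arc (t, t+1) carries the operators on duty during period t;
# free arc (t, t+4) carries the operators who start at period t+4.
working = {(t, (t + 1) % n): x[t] + x[(t - 1) % n] for t in range(n)}
free = {(t, (t + 4) % n): x[(t + 4) % n] for t in range(n)}
flow = {**working, **free}

conserved = all(
    sum(f for (v, w), f in flow.items() if w == t)
    == sum(f for (v, w), f in flow.items() if v == t)
    for t in range(n))
covered = all(working[(t, (t + 1) % n)] >= needed[t] for t in range(n))
side_constraints = all(
    working[(t, (t + 1) % n)]
    == free[((t - 4) % n, t)] + free[((t + 1) % n, (t + 5) % n)]
    for t in range(n))

operators = sum(x)
cost = sum(working.values())              # working arcs cost 1, free arcs cost 0
lower_bound = max(sum(needed[0::2]), sum(needed[1::2]))
print(conserved, covered, side_constraints, operators, cost, lower_bound)
# True True True 207 414 207
```

Since the schedule uses 207 operators and no schedule can use fewer than 207, the claim that it is optimal is confirmed on this instance.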

The solution algorithm is very simple and given in Fig. 8. For Flow to be a circulation, we post one flow constraint. Since, for a circulation, the demand at any node is zero, we can set NodeType[v] = supply for every node v. To satisfy equations (11), we post n linear constraints. Finally, we solve the problem using the min_max procedure which labels variables Flow in order to minimize variable FlowVal.



n = 6; m = 12;
OpNeeded = [26,52,86,120,75,35];
UpCap = max_{0≤i<n} OpNeeded[i];
NodeType = [supply, supply, supply, supply, supply, supply];
Edge = [[0,1],[1,2],[2,3],[3,4],[4,5],[5,0],[0,4],[1,5],[2,0],[3,1],[4,2],[5,3]];
EdgeCost = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0];
Demand[v] = 0, v = 0, ..., n-1;
Flow[e] ∈ [OpNeeded[e], UpCap], e = 0, ..., n-1;
Flow[e] ∈ [0, UpCap], e = n, ..., m-1;
FlowVal ∈ [0, UpCap·n];
flow(NodeType, Edge, [], EdgeCost, [], Demand, Flow, FlowVal).
for (i = 0, ..., n-1)
    Flow[i] = Flow[n + ((i + 1) mod n)] + Flow[n + ((i + 2) mod n)];
min_max(labeling(Flow), FlowVal);

Fig. 8. Personnel scheduling: model and solution procedure



208 A. Bockmayr, N. Pisaruk, and A. Aggoun 



5 Implementation 

The flow constraint has been implemented using the Chip/C Library [H]. This gives us access to low-level primitives of the Chip/C kernel in order to control the propagation mechanisms described in Sect. 3.2. The C++ layer on top of the C version of the flow constraint, which is needed for Chip/C++, is currently under development.



void PersonnelScheduling(int* OperNeeded) {
  const int n=6, m=12;
  tagNodeType NodeType[ ] = {supply, supply, supply, supply, supply, supply};
  int Tail[ ] = {0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5};
  int Head[ ] = {1, 2, 3, 4, 5, 0, 4, 5, 0, 1, 2, 3};
  int EdgeCost[ ] = {1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0};
  DvarPtr Demand[6], Flow[12], FlowVal;
  int i, UpBound=0, Ones[ ] = {1,1};
  DvarPtr* Var[2];
  for (i=0; i < n; i++) {
    c_create_domain_array(Demand+i,1,0,0,INTERVAL);
    if (UpBound < OperNeeded[i]) UpBound = OperNeeded[i]; }
  for (i=0; i < n; i++)
    c_create_domain_array(Flow+i,1,OperNeeded[i],UpBound,INTERVAL);
  c_create_domain_array(Flow+n,n,0,UpBound,INTERVAL);
  c_create_domain_array(&FlowVal,1,0,n*UpBound,INTERVAL);
  for (i=0; i < n; i++) {
    Var[0]=Flow[n+(i+1)%n]; Var[1]=Flow[n+(i+2)%n];
    c_dom_linear_sum(Var,Ones,2,Flow[i]); }
  Flow(n,m,NodeType,Tail,Head,NULL,Demand,0,NULL,
       Flow,m,NULL,1,NULL,EdgeCost,&FlowVal);
  c_min_max(Flow,m,&FlowVal,1,METHOD_MAX_OF_MIN,NULL,NULL);
  printf("Schedule: %d operators needed\n",c_domain_min(FlowVal)/2);
  printf("Period : Starting work\n");
  for (i=0; i < n; i++)
    printf(" %d %d\n",i,c_domain_min(Flow[n+(i+2)%n])); }

Fig. 9. Personnel scheduling: C implementation



There is an almost one-to-one correspondence between the parameters of the flow constraint described earlier and the parameters of the C function Flow that implements the constraint. In Fig. 9 we present a C implementation of the procedure in Fig. 8 which solves the personnel scheduling problem described in Sect. 4.3. The procedure takes as input an integer array OperNeeded of size n = 6, where OperNeeded[i] is the number of operators needed during period i. We specify the graph of Fig. 7 using constant arrays Tail and Head. Furthermore, we are using the following Chip/C primitives:

— DvarPtr x: defines a pointer x to a domain variable.

— c_create_domain_array(array, n, min, max, INTERVAL): creates an array of n domain variables ranging over the interval from min to max.

— c_min_max(Flow, m, FlowVal, 1, METHOD_MAX_OF_MIN): branch and bound method minimizing the domain variable FlowVal by enumerating variables of the array Flow, which contains m domain variables; the variable selection used is METHOD_MAX_OF_MIN (decreasing order of lower bounds).

6 Conclusion and Further Research 

We have introduced in this paper a new global constraint flow for modeling and 
solving network flow problems in constraint programming. We have described 
the declarative and operational semantics and presented a number of illustrating 
examples. 

Our work was motivated by problems in supply-chain optimization that we 
encountered through our participation in the LiSCOS project (Large Scale Inte- 
grated Supply Chain Optimization Software based on Branch-and-Cut and Con- 
straint Programming), funded by the European Community. We are currently 
studying how the flow constraint can be used in some large-scale supply chain 
optimization problems provided by our industrial partners. In particular, we are 
investigating models for batch processing problems occurring in the chemical in- 
dustry, where nodes represent feeds (raw materials, intermediate products) and 
process operations, and arcs indicate the flow of the material. The flow con- 
straint is complementary to the global constraints cumulative and assignment 
of CHIP. It allows us to handle in an efficient way the various constraints on 
stocks occurring in scheduling problems. 

This paper has focussed on using the flow constraint within finite domain 
constraint programming. It is clear that the flow constraint can also be used as 
a mixed global constraint in a hybrid CP/MIP solver. This is another important 
topic for further research, see m for related work. 
Abstract. The paper presents propagation rules that are common to the minimum constraint family and to the number of distinct values constraint family. One practical interest of the paper is to describe an implementation of the number of distinct values constraint. This is a quite common counting constraint that one encounters in many practical applications such as timetabling or frequency allocation problems. A second important contribution is to provide a pruning algorithm for the constraint "at most n distinct values for a set of variables". This can be considered as the counterpart of Régin's algorithm for the alldifferent constraint where one enforces having at least n distinct values for a given set of n variables.



1 Introduction 

The purpose of this paper is to present propagation rules for the minimum constraint family as well as for the number of distinct values family that were introduced in [1]. The minimum constraint family has the form $minimum(M, r, \{V_1,..,V_n\})$, where $M$ is a variable, $r$ is an integer value ranging from 0 to $n-1$, and $\{V_1,..,V_n\}$ is a collection of variables. Variables take their value in a finite discrete set of items. The constraint holds if $M$ corresponds to the item of rank $r$ according to a given total ordering relation $\mathcal{R}$ between the items assigned to variables $V_1,..,V_n$ ¹. For instance $minimum(4, 2, \{9,3,3,4\})$ fails since 4 is not the $(2+1)$th smallest distinct value of 9, 3, 3, 4, while $minimum(9, 2, \{9,3,3,4\})$ succeeds. If there is no such item of rank $r$, $M$ takes the maximum possible value over all items. Relation $\mathcal{R}$ is defined in a procedural way by the following functions that will be used in order to make our propagation algorithms generic:

- min_item returns an item that corresponds to a value that is less than or equal to all items that can be taken by variables $V_1,..,V_n$,

- max_item returns an item that corresponds to a value that is greater than or equal to all items that can be taken by variables $V_1,..,V_n$,



¹ This is different from the problem of finding the $(r+1)$th smallest value [2, pages 185-191]: in our case all the variables that have the same value have the same rank and we want to find the $(r+1)$th smallest distinct value. For instance, the second smallest distinct value of 9, 4, 1, 3, 1, 4 is equal to 3 (and not 1).

T. Walsh (Ed.): CP 2001, LNCS 2239, pp. 211-224, 2001. 
- $I \prec J$ is true iff item $I$ is less than item $J$,

- $I \succ J$ is true iff item $I$ is greater than item $J$,

- next($I$): if $I \neq$ max_item then returns the smallest item that is greater than item $I$,

- prev($I$): if $I \neq$ min_item then returns the largest item that is smaller than item $I$,

- min($V$) returns the minimum item that can be assigned to variable $V$,

- max($V$) returns the maximum item that can be assigned to variable $V$,

- remove_val($V$, $I$) removes item $I$ from the feasible values of variable $V$,

- adjust_min($V$, $I$) adjusts the minimum feasible value of variable $V$ to item $I$,

- adjust_max($V$, $I$) adjusts the maximum feasible value of variable $V$ to item $I$.

Defining a member C of the minimum constraint family will be achieved by providing the previous set of functions for the total ordering relation $\mathcal{R}$ that is specific to constraint C. This has the main advantage that one can introduce a new member of the family without having to reconsider all the propagation algorithms. The complexity results about the algorithms of this paper assume that all functions used for defining $\mathcal{R}$ are performed in $O(1)$.
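
On fully assigned variables the declarative meaning of the minimum family can be checked directly. The following Python sketch assumes that items are integers and that $\mathcal{R}$ is their natural order; the function names and the use of `float("inf")` as a stand-in for max_item are hypothetical. It reproduces the two examples of the introduction and the one of footnote 1.

```python
def rank_r_distinct(values, r, max_item=float("inf")):
    """Item of rank r (0-based) among the distinct values, or max_item if none."""
    distinct = sorted(set(values))
    return distinct[r] if r < len(distinct) else max_item

def minimum_holds(m, r, values, max_item=float("inf")):
    """Ground check of minimum(M, r, {V_1,..,V_n})."""
    return m == rank_r_distinct(values, r, max_item)

print(minimum_holds(4, 2, [9, 3, 3, 4]))         # False: the third distinct value is 9
print(minimum_holds(9, 2, [9, 3, 3, 4]))         # True
print(rank_r_distinct([9, 4, 1, 3, 1, 4], 1))    # 3, the second smallest distinct value
```

This is only a checker for ground instances; the propagation algorithms of the paper work on domains, not on assigned values.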

The number of distinct values family has the form $nclass(C, \{V_1,..,V_n\}, Eq)$ where $C$ is a variable, $\{V_1,..,V_n\}$ is a collection of variables, and $Eq$ an equivalence relation among the possible values of $V_1,..,V_n$. The constraint holds if $C$ is the number of distinct equivalence classes taken by the values of variables $V_1,..,V_n$ according to the equivalence relation $Eq$.
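
A ground check of the nclass constraint is equally short. In this Python sketch the equivalence relation is represented by a function `class_of` that maps each value to a representative of its class; this representation, like the function name `nclass_holds`, is an assumption made for illustration.

```python
def nclass_holds(c, values, class_of=lambda v: v):
    """Ground check of nclass(C, {V_1,..,V_n}, Eq).

    class_of maps a value to a canonical representative of its equivalence
    class; the default (identity) counts plain distinct values.
    """
    return c == len({class_of(v) for v in values})

print(nclass_holds(3, [9, 3, 3, 4]))                     # True: values 3, 4, 9
print(nclass_holds(2, [1, 4, 7, 2], lambda v: v % 3))    # True: residues 1 and 2
```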

The next section presents some instances of the minimum constraint family. Sect. 3 and 4 present two algorithms that are used several times by the different pruning algorithms. These algorithms provide a lower bound for the minimum number of distinct values and for the $(r+1)$th smallest distinct value. Sect. 5 shows how to reduce the domain of variable $M$, while Sect. 6 explains how to shrink domains of variables $V_1,..,V_n$. Finally, Sect. 7 indicates how to use the algorithms of this paper in order to implement the propagation for the number of distinct values constraint.



2 The Minimum Constraint Family 

This section lists some instances of the minimum constraint family and provides the 
corresponding functions, which define the total ordering relation 9? , for two of the 
specified instances. Finally it gives one practical application within the domain of 
resource scheduling. 

Examples of the minimum family are:

- minimum($MIN$, $\{VAR_1,..,VAR_n\}$): $MIN$ is the minimum value of $VAR_1,..,VAR_n$,

- maximum($MAX$, $\{VAR_1,..,VAR_n\}$): $MAX$ is the maximum value of $VAR_1,..,VAR_n$,

- min_n($MIN$, $r$, $\{VAR_1,..,VAR_n\}$): $MIN$ is the minimum of rank $r$ of $VAR_1,..,VAR_n$, or max_item if there is no variable of rank $r$ ²,

- max_n($MAX$, $r$, $\{VAR_1,..,VAR_n\}$): $MAX$ is the maximum of rank $r$ of $VAR_1,..,VAR_n$, or min_item if there is no variable of rank $r$,

- minimum_pair($PAIR$, $\{PAIR_1,..,PAIR_n\}$): $PAIR$ is the minimum pair of $PAIR_1,..,PAIR_n$,

- maximum_pair($PAIR$, $\{PAIR_1,..,PAIR_n\}$): $PAIR$ is the maximum pair of $PAIR_1,..,PAIR_n$.

² Note that removing value max_item from the possible values of variable $MIN$ will enforce the minimum of rank $r$ to be defined.



Table 1. Functions associated to the maximum and minimum_pair constraints

function         | maximum              | minimum_pair
min_item         | MAXINT               | (MININT, MININT)
max_item         | MININT               | (MAXINT, MAXINT)
I ≺ J            | I > J                | (I.x < J.x) ∨ (I.x = J.x ∧ I.y < J.y)
I ≻ J            | I < J                | (I.x > J.x) ∨ (I.x = J.x ∧ I.y > J.y)
next(I)          | I-1                  | IF I.y < MAX_Y THEN (I.x, I.y+1) ELSE (I.x+1, MIN_Y)
prev(I)          | I+1                  | IF I.y > MIN_Y THEN (I.x, I.y-1) ELSE (I.x-1, MAX_Y)
min(V)           | max_var(V)           | (min_var(V.x), min_var(V.y))
max(V)           | min_var(V)           | (max_var(V.x), max_var(V.y))
remove_val(V,I)  | remove_val_var(V,I)  | IF V.x = I.x THEN ³ remove_val_var(V.y, I.y); IF V.y = I.y THEN ⁴ remove_val_var(V.x, I.x)
adjust_min(V,I)  | adjust_max_var(V,I)  | adjust_min_var(V.x, I.x); IF max_var(V.x) = I.x THEN ⁵ adjust_min_var(V.y, I.y)
adjust_max(V,I)  | adjust_min_var(V,I)  | adjust_max_var(V.x, I.x); IF min_var(V.x) = I.x THEN ⁶ adjust_max_var(V.y, I.y)



In all the previous constraints, MIN, MAX and $VAR_1,..,VAR_n$ are domain variables ⁷, while PAIR and $PAIR_1,..,PAIR_n$ are ordered pairs of domain variables. Table 1 gives for the maximum and minimum_pair constraints the different functions introduced in the first section. For minimum_pair, .x and .y indicate respectively the first and second attribute of a pair, while MIN_Y and MAX_Y are the minimum and maximum value for the .y attribute. MININT and MAXINT correspond respectively to the minimum and maximum possible integers. min_var(V) (respectively max_var(V)) returns the minimum (respectively maximum) value of the domain variable V. remove_val_var(V, I) removes value I from variable V. adjust_min_var(V, I) (respectively adjust_max_var(V, I)) adjusts the minimum (respectively maximum) value of variable V to value I.

³ For the if conditional statement we should generate the constraint: V.x = I.x ⇒ V.y ≠ I.y.

⁴ For the if conditional statement we should generate the constraint: V.y = I.y ⇒ V.x ≠ I.x.

⁵ For the if conditional statement we should generate the constraint: V.x = I.x ⇒ V.y ≥ I.y.

⁶ For the if conditional statement we should generate the constraint: V.x = I.x ⇒ V.y ≤ I.y.

⁷ A domain variable is a variable that ranges over a finite set of integers; min(V) and max(V) respectively denote the minimum and maximum values of variable V.
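
The item-level functions of the minimum_pair column amount to a lexicographic order on pairs with a bounded second attribute. The Python sketch below illustrates the comparison, next and prev functions only; the bounds `MIN_Y = 0` and `MAX_Y = 3` and the function names are hypothetical, and the domain-level functions (min, max, remove_val, adjust_min, adjust_max) are not covered.

```python
MIN_Y, MAX_Y = 0, 3   # hypothetical bounds of the .y attribute

def precedes(i, j):
    """I ≺ J for pairs (x, y): lexicographic comparison."""
    return i[0] < j[0] or (i[0] == j[0] and i[1] < j[1])

def next_item(i):
    """Smallest pair greater than i."""
    x, y = i
    return (x, y + 1) if y < MAX_Y else (x + 1, MIN_Y)

def prev_item(i):
    """Largest pair smaller than i."""
    x, y = i
    return (x, y - 1) if y > MIN_Y else (x - 1, MAX_Y)

print(next_item((2, 1)), next_item((2, 3)))   # (2, 2) (3, 0)
print(prev_item((3, 0)))                      # (2, 3)
print(precedes((2, 3), (3, 0)))               # True
```

Python's built-in tuple comparison implements the same lexicographic order, so `precedes(i, j)` agrees with `i < j` on pairs of integers.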

We finish Sect. 2 by providing a practical example of utilization of the min_n($MIN$, $r$, $\{VAR_1,..,VAR_n\}$) constraint for modeling a specific type of precedence constraint. Assume we have a set T of n tasks which all have a duration of one and which are in disjunction. Furthermore let $End_1,..,End_n$ be the end variables of the tasks of T, and let Start be the start of one other task which should not start before the completion of at least m tasks of T. This generalized precedence constraint can be modeled by using the conjunction of the following constraints: min_n($S$, $m-1$, $\{End_1,..,End_n\}$) and $Start \ge S$. On one side this allows expressing directly the disjunctive constraint within the generalized precedence constraint. As a consequence this also leads to adjusting the minimum value of the Start variable both according to the precedence constraint and to the fact that the tasks of T should not overlap.



3 Computing a Lower Bound of the Minimum Number of Distinct 
Values of a Sorted List of Variables 

This section describes an algorithm that evaluates a lower bound of the minimum number of distinct values of a set of variables $\{U_1,..,U_n\}$ sorted on increasing minimum value. This lower bound depends on the minimum and maximum values of these variables. Note that this is similar to the problem of finding a lower bound on the number of vertices of the dominating set [5, page 190], [4] of the graph $G = (V, R)$ defined in the following way:

- to each variable of $\{U_1,..,U_n\}$ and to each possible value that can be taken by at least one variable of $\{U_1,..,U_n\}$, we associate a vertex of the set $V$,

- if a value $v$ can be taken by a variable $U_i$ $(1 \le i \le n)$ we create an edge that starts from $v$ and ends at $U_i$; we also create an edge between each pair of values.

Fig. 1 shows the execution of the algorithm on a set of 9 variables $\{U_1,..,U_9\}$ with the respective domains 0..3, 0..1, 1..7, 1..6, 1..2, 3..4, 3..3, 4..6 and 4..5. Each variable corresponds to a given column and each value to a row. Values that do not belong to the domain of a variable are put in black, while intervals low..up that are produced by the algorithm (see lines 4, 5) are dashed. In this example the computed minimum number of distinct values is equal to 3.
We now give the algorithm:

1 ndistinct:=1; reinit:=TRUE; i:=1;
2 WHILE i<n DO
3   IF NOT reinit THEN i:=i+1 ENDIF;
4   IF reinit OR low ≺ min(U_i) ⁸ THEN low:=min(U_i) ENDIF;
5   IF reinit OR up ≻ max(U_i) THEN up:=max(U_i) ENDIF;
6   reinit:=(low ≻ up);
7   IF reinit THEN ndistinct:=ndistinct+1 ENDIF;
8 ENDWHILE;

Alg. 1. Computing the minimum number of distinct values






Algorithm 1 partitions the set of variables $\{U_1,..,U_n\}$ in ndistinct groups of consecutive variables by starting a new group each time reinit is set to value TRUE (see line 6). If for each group we consider the variable with the smallest maximum value, and the largest minimum value in case of tie, then we have ndistinct pairwise non-intersecting variables ⁹. From this fact we derive that we have a valid lower bound. In the example of Fig. 1 we have the three following groups: $U_1,U_2,U_3,U_4,U_5$ and $U_6,U_7$ and $U_8,U_9$. The three pairwise non-intersecting variables are variables $U_2$, $U_7$ and $U_9$. The lower bound obtained by algorithm 1 is sharp when for each group of variables there is at least one value in common. This is for example the case when each domain variable consists of one single interval of consecutive values. Note that the same algorithm works also if the set of variables $\{U_1,..,U_n\}$ is sorted on decreasing maximum value. The algorithm ¹⁰ has a complexity $O(n)$ where $n$ is the number of variables.
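
The grouping performed by algorithm 1 can be sketched in a few lines of Python. This is an equivalent restatement rather than a line-by-line transcription: each variable is represented by its (minimum, maximum) pair, items are assumed to be integers with their natural order, and the function name is hypothetical. The running interval low..up is the intersection of the intervals of the current group, and a new group starts as soon as that intersection becomes empty.

```python
def min_distinct_lower_bound(bounds):
    """Lower bound on the number of distinct values.

    bounds: list of (min, max) pairs, sorted on increasing minimum.
    """
    if not bounds:
        return 0
    ndistinct = 1
    low, up = bounds[0]
    for var_min, var_max in bounds[1:]:
        low, up = max(low, var_min), min(up, var_max)
        if low > up:                      # empty intersection: start a new group
            ndistinct += 1
            low, up = var_min, var_max
    return ndistinct

# The nine variables of Fig. 1.
fig1 = [(0, 3), (0, 1), (1, 7), (1, 6), (1, 2), (3, 4), (3, 3), (4, 6), (4, 5)]
print(min_distinct_lower_bound(fig1))  # 3
```

On the example the groups end after the fifth and the seventh variable, which matches the three groups listed above.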




Fig. 1. Generated intervals (columns: variables $U_1,..,U_9$; rows: values 0 to 8)



⁸ Throughout the algorithms of this paper, the evaluation of boolean expressions is performed from left to right in a lazy way. This explains why low does not need to be initialized.

⁹ Two domain variables are called non-intersecting variables when they don't have any value in common.

¹⁰ We did not include the sorting phase of the variables within algorithm 1 since, in Sect. 5, we call this algorithm several times on different parts of a given array of variables sorted on their decreasing maximum value.



216 



N. Beldiceanu 



1 ndist:=1;
2 reinit:=TRUE;
3 i:=1;
4 start_previous_group:=1;
5 WHILE (reinit AND i≤n) OR ((NOT reinit) AND i+1≤n) DO
6   IF NOT reinit THEN i:=i+1 ENDIF;
7   IF reinit OR low ≺ min(U_i) THEN low:=min(U_i) ENDIF;
8   IF reinit OR up ≻ max(U_i) THEN up:=max(U_i) ENDIF;
9   reinit:=(low ≻ up);
10  IF reinit OR i=n THEN
11    kinf[ndist]:=min_item; ksup[ndist]:=max_item;
12    IF reinit THEN end_previous_group:=i-1
      ELSE end_previous_group:=i ENDIF;
13    FOR j:=start_previous_group TO end_previous_group DO
14      below_current_group:=(NOT reinit) OR max(U_j) ≺ min(U_i);
15      IF below_current_group
        AND min(U_j) ≻ kinf[ndist] THEN kinf[ndist]:=min(U_j) ENDIF;
16      IF max(U_j) ≺ ksup[ndist] THEN ksup[ndist]:=max(U_j) ENDIF;
17    ENDFOR;
18    start_previous_group:=i;
19  ENDIF;
20  IF reinit THEN ndist:=ndist+1 ENDIF;
21 ENDWHILE;
22 IF ndist>ndistinct THEN FAIL ¹¹
23 ELSE IF ndist=ndistinct THEN
24   adjust minimum values of U_1,..,U_n to kinf[1];
25   adjust maximum values of U_1,..,U_n to ksup[ndistinct];
26   FOR j:=1 TO ndistinct-1 DO
27     remove intervals of values ksup[j]+1..kinf[j+1]-1 from U_1,..,U_n;
28   ENDFOR;
29 ENDIF;

Alg. 2. Pruning for avoiding to exceed the maximum number of allowed distinct values

Finally we make a remark that will be used later on, in order to shrink domains. Let $U_{i_1},..,U_{i_{ndistinct}}$ be a subset of the variables $U_1,..,U_n$ such that intervals $\min(U_{i_1})..\max(U_{i_1})$, ..., $\min(U_{i_{ndistinct}})..\max(U_{i_{ndistinct}})$ do not pairwise intersect. If at least one variable of $U_1,..,U_n$ takes a value that does not belong to the union of intervals $\min(U_{i_1})..\max(U_{i_1})$, ..., $\min(U_{i_{ndistinct}})..\max(U_{i_{ndistinct}})$, then the minimum number of distinct values in $U_1,..,U_n$ will be strictly greater than the quantity ndistinct returned by the algorithm. This is because we would get ndistinct+1 pairwise non-intersecting variables: the ndistinct variables $U_{i_1},..,U_{i_{ndistinct}}$, plus the additional variable that we fix. In the example of Fig. 1, we can remove from variables $U_1,..,U_9$ all values that do not belong to $\min(U_2)..\max(U_2) \cup \min(U_7)..\max(U_7) \cup \min(U_9)..\max(U_9) = \{0,1,3,4,5\}$, namely $\{2,6,7\}$, if we don't want to have more than three distinct values. But we can also remove all values that do not belong to $\min(U_5)..\max(U_5) \cup \min(U_7)..\max(U_7) \cup \min(U_9)..\max(U_9) = \{1,2,3,4,5\}$, namely $\{0,6,7\}$. We show how to modify algorithm 1 in order to get the values to remove if one wants to avoid having more than ndistinct distinct values. The new algorithm uses two additional arrays kinf[1..n] and ksup[1..n] for recording the lower and upper limits of the intervals of values that we don't have to remove. These intervals will be called the kernel of $U_1,..,U_n$.

¹¹ FAIL indicates that the constraint cannot hold and that we therefore exit the procedure; for simplicity we omit the fail in lines 24, 25 and 27, but it should be understood that adjusting the minimum or the maximum value of a variable, or removing values from a variable, could also generate a fail.

The complexity of lines 1 to 21 is still in $O(n)$, while the complexity of lines 22 to
29 is proportional to the number of values we remove from the domains of variables
$U_1,..,U_n$. If we run algorithm 2 on the example of Fig. 1, we get three intervals
kinf[1]..ksup[1], kinf[2]..ksup[2] and kinf[3]..ksup[3] that respectively
correspond to 1..1, 3..3 and 4..5. The lower and upper limits of interval 1..1 were
respectively obtained from the minimum value of $U_5$ (see lines 14, 15: $U_5$ is a variable
for which $\max(U_5) < \min(U_9) = 3$) and the maximum value of $U_2$ (see line 16). From
this we deduce that, if we don't want to have more than three distinct values, all
variables $U_1,..,U_9$ should be greater than or equal to 1, less than or equal to 5, and
different from 2.



4 Computing a Lower Bound of the (r+1)th Smallest Distinct Value
of a Set of Variables

When r is equal to 0 we scan the variables and return the associated minimum value.
When r is greater than 0, we use the following greedy algorithm, which successively
produces the r+1 smallest distinct values by starting from the smallest possible value
of a set of variables $U_1,..,U_n$. At each step of algorithm 3 we extract one variable
from $U_1,..,U_n$ according to the following priority rule: we select the variable with
the smallest minimum value, and with the smallest maximum value in case of tie (line
4). The key point is that at iteration k we consider the minimum value of all remaining
variables to be at least equal to the (k-1)th smallest value min produced so far (or to
min_item if k=1). This is achieved at line 4 of algorithm 3 by taking the maximum
value between $\min(U)$ and min.

Table 2 shows, for r=6 and for the set of variables $U_1,..,U_9$ with the respective
domains 4..9, 5..6, 0..1, 3..4, 0..1, 0..1, 4..9, 5..6, 5..6, the state of k, U, min and s just
before execution of the statement of line 10. From this we find out that the (6+1)th
smallest distinct value is greater than or equal to 7.
1 min := min_item; SU := {U_1,..,U_n}; k:=1; s:=r;
2 DO
3 IF k>n THEN BREAK ENDIF;
4 U := a variable of SU with the smallest value for
  maximum(min(U), min), and the smallest value for max(U)
  in case of tie;
5 SU := SU - {U};
6 IF k=1 OR min < max(U) THEN
7 IF k=1 OR min < min(U) THEN min := min(U)
  ELSE min := next(min) ENDIF;
8 s := s-1;
9 ENDIF;
10 k:=k+1;
11 WHILE s>=0;
12 IF s=-1 THEN RETURN min ELSE RETURN max_item ENDIF;

Alg. 3. Computing the (r+1)th smallest distinct value



Table 2. State of the main variables at the different iterations of algorithm 3

k     1     2     3     4     5     6     7     8     9
U     0..1  0..1  0..1  3..4  4..9  5..6  5..6  5..6  4..9
min   0     1     1     3     4     5     6     6     7
s     5     4     4     3     2     1     0     0     -1



In order to avoid the rescanning implied by line 4, and to have an overall
complexity of $O(n \lg n)$, we rewrite algorithm 3 by using a heap that contains
variables $U_1,..,U_n$ sorted in increasing order of their maximum.

1 let S_1,..,S_n be variables U_1,..,U_n sorted in increasing order
  of minimum value;
2 create an empty heap; k:=1; s:=r;
3 DO
4 extract from the heap all variables S for which:
  max(S) < min OR max(S) = min;
5 IF k>n AND empty heap THEN BREAK ENDIF;
6 IF empty heap THEN min := min(S_k) ELSE min := next(min) ENDIF;
7 WHILE k<=n AND min(S_k)=min DO push S_k on the heap; k:=k+1; ENDWHILE;
8 extract from the heap variable with smallest maximum value; s:=s-1;
9 WHILE s>=0;
10 IF s=-1 THEN RETURN min ELSE RETURN max_item ENDIF;

Alg. 4. Simplified version of Alg. 3 for computing the (r+1)th smallest distinct value
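
The heap-based scheme of Alg. 4 can be sketched in Python as follows. This is a minimal illustration rather than the author's implementation: it assumes integer interval domains given as (lo, hi) pairs, takes next(min) to be min+1, and uses the hypothetical name lower_bound_rth_distinct and a max_item default of infinity.

```python
import heapq

def lower_bound_rth_distinct(domains, r, max_item=float("inf")):
    """Lower bound of the (r+1)th smallest distinct value (sketch of Alg. 4).

    domains  -- list of (lo, hi) integer intervals, one per variable
    max_item -- stand-in for the paper's max_item (assumption: larger than any value)
    """
    order = sorted(domains)       # S_1..S_n, increasing minimum
    n = len(order)
    heap = []                     # maxima of the variables currently in the heap
    k = 0                         # index of the next variable to push
    s = r
    cur = None                    # plays the role of "min" in the pseudocode
    while s >= 0:
        # Line 4: drop variables whose maximum is <= the last produced value.
        while heap and heap[0] <= cur:
            heapq.heappop(heap)
        # Line 5: nothing left to extract.
        if k >= n and not heap:
            break
        # Line 6: jump to the next minimum, or move to the next value.
        cur = order[k][0] if not heap else cur + 1
        # Line 7: push every variable whose minimum has been reached.
        while k < n and order[k][0] <= cur:
            heapq.heappush(heap, order[k][1])
            k += 1
        # Line 8: use the variable with the smallest maximum for this value.
        heapq.heappop(heap)
        s -= 1
    return cur if s == -1 else max_item

# Domains of Table 2
table2 = [(4, 9), (5, 6), (0, 1), (3, 4), (0, 1), (0, 1), (4, 9), (5, 6), (5, 6)]
print(lower_bound_rth_distinct(table2, 6))
```

On the domains of Table 2 with r=6 the sketch goes through the values 0, 1, 3, 4, 5, 6, 7, which is the sequence of the min row once the iterations that do not decrease s are skipped.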



5 Pruning of M

The minimum value of M corresponds to the smallest (r+1)th item that can be
generated from the values of variables $V_1,..,V_n$. Note that, since all variables that take
the same value will have the same rank according to the ordering relation $\mathcal{R}$, we have
to find r+1 distinct values. For this purpose we use algorithm 4. Note that
algorithm 4 will return max_item if there is no way to generate r+1 distinct
values; since this is the biggest possible value, this will fix M to value max_item.

When r is equal to 0, the maximum value of M is equal to the smallest maximum
value of variables $V_1,..,V_n$. When r is greater than 0, the maximum value of M is
computed by the following three steps. We denote by
min_nval($U_1,..,U_m$) a call to the algorithm that computes a lower bound of the
minimum number of distinct values of a set of variables $\{U_1,..,U_m\}$ (see algorithm 1 of
Sect. 3). We sort variables $V_1,..,V_n$ in decreasing order of their maximum value and
perform the following steps in the given order:

- if none of $V_1,..,V_n$ can take max_item as value, and if there are at least r+1 distinct
values for variables $V_1,..,V_n$ (i.e. min_nval($V_1,..,V_n$) $\geq$ r+1), then we are sure that the
(r+1)th item will always be defined; so we update the maximum value of M to
prev(max_item).

- if the maximum value of M is less than max_item, we make a binary search (on
$V_1,..,V_n$ sorted in decreasing order of their maximum value) for the largest suffix
for which the minimum number of distinct values is equal to r+1; finally, we
update the maximum value of M to the maximum value of the variables of that
largest suffix. This is a valid upper bound for M, since taking a larger
value for the (r+1)th smallest distinct value would lead to at least r+2 distinct
values. Since algorithm 1 is called no more than $\lg n$ times, the overall complexity
of this step is $O(n \lg n)$.

- when the largest suffix found at the previous step contains all variables $V_1,..,V_n$,
we update the maximum value of M to the maximum value of the kernel of
$V_1,..,V_n$. This is the value ksup[ndist] computed by algorithm 2. This is again a
valid upper bound since taking a larger value for M would lead to r+2 distinct
values: by definition of the kernel (see Sect. 3), all values that are not in the kernel
lead to one additional distinct value.

Let us illustrate the pruning of the maximum value of M on the instance
min_n(M, 1, $\{V_1,..,V_9\}$), with $V_1,..,V_9$ having respectively the following domains 0..3,
0..1, 1..7, 1..6, 1..4, 3..4, 3..3, 4..6 and 4..5, and M having the domain 0..9. By sorting
$V_1,..,V_9$ in decreasing order of their maximum value we obtain
$V_3,V_4,V_8,V_9,V_5,V_6,V_1,V_7,V_2$. We then use a binary search that starts from interval 1..9
and produces the following sequence of queries:

- inf=1, sup=9, mid=5; min_nval($V_5,V_6,V_1,V_7,V_2$) returns 2, which is less than or equal
to r+1=2,

- inf=1, sup=5, mid=3; min_nval($V_8,V_9,V_5,V_6,V_1,V_7,V_2$) returns 3, which is greater than
r+1=2,

- inf=4, sup=5, mid=4; min_nval($V_9,V_5,V_6,V_1,V_7,V_2$) returns 3, which is greater than
r+1=2.

From this, we deduce that the maximum value of M is at most equal to the
maximum value of variable $V_5$, namely 4.
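
The binary search of this example can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not the paper's code: domains are integer intervals given as (lo, hi) pairs, min_nval is a standard greedy interval-covering count used as a stand-in for algorithm 1, and the function name max_of_m_by_suffix_search is hypothetical.

```python
def min_nval(domains):
    """Minimum number of distinct values needed so that every interval
    contains at least one of them (greedy on increasing maximum)."""
    count = 0
    last = None                          # last value chosen
    for lo, hi in sorted(domains, key=lambda d: d[1]):
        if last is None or lo > last:    # interval not covered yet
            last = hi                    # choose its right end
            count += 1
    return count

def max_of_m_by_suffix_search(domains, r):
    """Binary search for the largest suffix (variables sorted by decreasing
    maximum) whose min_nval does not exceed r+1; returns the maximum value
    of the first variable of that suffix."""
    order = sorted(domains, key=lambda d: d[1], reverse=True)
    inf, sup = 0, len(order) - 1
    while inf < sup:
        mid = (inf + sup) // 2
        if min_nval(order[mid:]) <= r + 1:
            sup = mid                    # the suffix can be extended leftwards
        else:
            inf = mid + 1                # too many distinct values, shrink it
    return order[inf][1]

# V_1..V_9 of the example, with r = 1
example = [(0, 3), (0, 1), (1, 7), (1, 6), (1, 4), (3, 4), (3, 3), (4, 6), (4, 5)]
print(max_of_m_by_suffix_search(example, 1))
```

With 0-based indices the sketch probes the suffixes starting at positions 5, 3 and 4 of the sorted sequence, as in the three queries listed above.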

Finally, since variable M will be equal to one of the variables $V_1,..,V_n$ or to value
max_item, we must remove from M all values different from max_item that do not
belong to any variable of $V_1,..,V_n$. If only one single variable of $V_1,..,V_n$ has some
values in common with M, and if M cannot take max_item as value, then this
variable should be unified with M.

6 Pruning of $V_1,..,V_n$

Pruning of variables $V_1,..,V_n$ is achieved by using the following deduction rules:

• Rule 1: If n-r-1 variables are greater than M then the remaining variables are
less than or equal to M.

• Rule 2: If M ≠ max_item then we have at least r+1 distinct values for the variables
of $V_1,..,V_n$ that are less than or equal to M.

• Rule 3: We have at most r+1 distinct values for the variables of $V_1,..,V_n$ that are
less than or equal to M.

• Rule 4: If M ≠ max_item then we have at least r distinct values for the variables of
$V_1,..,V_n$ that are less than M.

• Rule 5: We have at most r distinct values for the variables of $V_1,..,V_n$ that are less
than M.

Rules 2 and 4 impose a condition on the minimum number of distinct values, while
rules 3 and 5 enforce a restriction on the maximum number of distinct values. In order
to implement the previous rules we consider the following subsets of variables of
$V_1,..,V_n$:

- $V_{<}$ is the set of variables $V_i$ that are for sure less than M (i.e. $\max(V_i) < \min(M)$),

- $V_{\leq}$ is the set of variables $V_i$ that are for sure less than or equal to
M (i.e. $\max(V_i) \leq \min(M)$),

- $V_{>}$ is the set of variables $V_i$ that are for sure greater than M
(i.e. $\min(V_i) > \max(M)$),

- $V_{\not>}$ is the set of variables $V_i$ that may be less than or equal to M
(i.e. $\min(V_i) \leq \max(M)$),

- $V_{\not\geq}$ is the set of variables $V_i$ that may be less than M (i.e. $\min(V_i) < \max(M)$),

- $V_{\not<}$ is the set of variables $V_i$ that may be greater than or equal to M
(i.e. $\max(V_i) \geq \min(M)$),

- $V_{\not\leq}$ is the set of variables $V_i$ that may be greater than M (i.e. $\max(V_i) > \min(M)$).

Some languages, such as Prolog, offer unification as a basic primitive. If this is
not the case then one has to find a way to simulate it. This can be achieved by using equality
constraints.

If there are not r+1 distinct values among variables $V_1,..,V_n$ then variable M takes by
definition value max_item (see Sect. 2) and therefore all variables $V_1,..,V_n$ are less than or
equal to M.

$|V_{>}|$ denotes the number of variables in $V_{>}$. We also introduce the four following
algorithms that take a subset of variables V of $V_1,..,V_n$ and an integer value vmax as
arguments, and perform the respective following tasks:

- min_nval(V) is a lower bound of the minimum number of distinct values of the
variables of V; it is computed with algorithm 1,

- min_nval_prune(V, vmin) removes from the variables of V all values less than or
equal to vmin that do not belong to the kernel of V; it uses algorithm 2,

- max_matching(V, vmax) is the size of the maximum matching of the following
bipartite graph: the two classes of vertices correspond to the variables of V and to
the union of values, less than or equal to a given limit vmax, of the variables of V;
there is an edge for each value less than or equal to vmax that a variable of V can
take; when we consider only intervals for the variables of V,
it can be computed in linear time in the number of variables of V with the
algorithm given in [9],

- matching_prune(V, vmax) removes from the bipartite graph associated to V and
vmax all edges that do not belong to any maximum matching (this includes values
which are greater than vmax); for this purpose we use the algorithm given in [3] or
[7].
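
For interval domains, the size returned by max_matching(V, vmax) can be illustrated with a simple greedy sketch. This is not the linear-time algorithm of [9]: it is the classical earliest-deadline greedy for interval (convex) bipartite graphs, written here with a heap, and it assumes integer domains given as (lo, hi) pairs.

```python
import heapq

def max_matching(domains, vmax):
    """Size of a maximum matching between interval variables and the values
    <= vmax they can take. Values are scanned in increasing order and each
    value is given to the available variable with the smallest maximum."""
    # Keep only variables that can take a value <= vmax, truncated at vmax.
    order = sorted((lo, min(hi, vmax)) for lo, hi in domains if lo <= vmax)
    heap, k, size = [], 0, 0
    value = None
    while k < len(order) or heap:
        if not heap:
            value = order[k][0]            # jump to the next useful value
        while k < len(order) and order[k][0] <= value:
            heapq.heappush(heap, order[k][1])
            k += 1
        while heap and heap[0] < value:    # variables that can no longer be matched
            heapq.heappop(heap)
        if heap:
            heapq.heappop(heap)            # match `value` with the tightest variable
            size += 1
            value += 1
    return size

print(max_matching([(3, 4), (3, 4), (3, 4), (6, 9)], 6))
print(max_matching([(1, 2), (1, 3), (1, 2)], 5))
```

The two calls use the interval domains 3..4, 3..4, 3..4, 6..9 with vmax=6 and 1..2, 1..3, 1..2 with vmax=5.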

We now restate the deduction rules in the following way:

Rule 1: IF $|V_{>}|$ = n-r-1 THEN $\forall V_i \in V_{\not>}$: max($V_i$) < next(max(M))

Rule 2: IF max(M) ≠ max_item AND max_matching($V_{\not>}$, max(M)) < r+1 THEN fail
ELSE IF max(M) ≠ max_item AND max_matching($V_{\not>}$, max(M)) = r+1 THEN
matching_prune($V_{\not>}$, max(M))

Rule 3: IF min_nval($V_{\leq}$) > r+1 THEN fail
ELSE IF min_nval($V_{\leq}$) = r+1 THEN min_nval_prune($V_{\leq}$, min(M))

Rule 4: IF max(M) ≠ max_item AND max_matching($V_{\not\geq}$, prev(max(M))) < r THEN fail
ELSE IF max(M) ≠ max_item AND max_matching($V_{\not\geq}$, prev(max(M))) = r THEN
matching_prune($V_{\not\geq}$, prev(max(M)))

Rule 5: IF min_nval($V_{<}$) > r THEN fail
ELSE IF min_nval($V_{<}$) = r THEN min_nval_prune($V_{<}$, prev(min(M)))
We give several examples of application of the previous deduction rules. 
min_n(M:2..3, r:1, {$V_1$:0..9, $V_2$:4..9, $V_3$:0..9}):

Rule 1: Since $V_{>} = \{V_2\}$ and $|V_{>}|$ = n-r-1 = 3-1-1 = 1, we have:
max($V_1$) ≤ max(M) = 3 and max($V_3$) ≤ max(M) = 3.

min_n(M:4..6, r:3, {$V_1$:3..4, $V_2$:3..4, $V_3$:3..4, $V_4$:6..9, $V_5$:7..9}):

Rule 2: No solution since $V_{\not>} = \{V_1,V_2,V_3,V_4\}$ and max_matching($V_{\not>}$, 6) = 3 < r+1 = 4.

min_n(M:1..2, r:2, {$V_1$:0..1, $V_2$:0..3, $V_3$:0..1, $V_4$:3..7}):

Rule 2: Since $V_{\not>} = \{V_1,V_2,V_3\}$ and max_matching($V_{\not>}$, 2) = 3 = r+1, we have: $V_2$ = 2.

min_n(M:6..7, r:1, {$V_1$:0..1, $V_2$:1..2, $V_3$:3..4, $V_4$:0..3, $V_5$:4..5, $V_6$:5..6, $V_7$:2..9}):

Rule 3: No solution since $V_{\leq} = \{V_1,V_2,V_3,V_4,V_5,V_6\}$ and min_nval($V_{\leq}$) = 3 > r+1 = 2
(min_nval($V_{\leq}$) is equal to 3 since intervals min($V_1$)..max($V_1$), min($V_3$)..max($V_3$) and
min($V_6$)..max($V_6$) do not pairwise intersect).

min_n(M:6..7, r:2, {$V_1$:0..1, $V_2$:1..2, $V_3$:3..4, $V_4$:0..3, $V_5$:4..5, $V_6$:5..6, $V_7$:2..9}):

Rule 3: Since $V_{\leq} = \{V_1,V_2,V_3,V_4,V_5,V_6\}$ and min_nval($V_{\leq}$) = 3 = r+1, and because
intervals min($V_1$)..max($V_1$), min($V_3$)..max($V_3$) and min($V_6$)..max($V_6$) do not pairwise
intersect, we can remove all values, less than or equal to min(M) = 6, that do not
belong to min($V_1$)..max($V_1$) ∪ min($V_3$)..max($V_3$) ∪ min($V_6$)..max($V_6$) = {0,1} ∪ {3,4} ∪ {5,6};
therefore we remove value 2 from $V_2$, $V_4$ and $V_7$.

min_n(M:4..6, r:3, {$V_1$:1..2, $V_2$:1..2, $V_3$:1..2, $V_4$:6..9, $V_5$:7..9}):

Rule 4: No solution since $V_{\not\geq} = \{V_1,V_2,V_3\}$ and max_matching($V_{\not\geq}$, 5) = 2 < r = 3.

min_n(M:4..6, r:3, {$V_1$:1..2, $V_2$:1..3, $V_3$:1..2, $V_4$:6..9, $V_5$:7..9}):

Rule 4: Since $V_{\not\geq} = \{V_1,V_2,V_3\}$ and max_matching($V_{\not\geq}$, 5) = 3 = r, we have: $V_2$ = 3.

min_n(M:5..6, r:1, {$V_1$:0..1, $V_2$:1..2, $V_3$:3..4, $V_4$:5..9, $V_5$:0..9}):

Rule 5: No solution since $V_{<} = \{V_1,V_2,V_3\}$ and min_nval($V_{<}$) = 2 > r = 1 (min_nval($V_{<}$)
is equal to 2 since intervals min($V_1$)..max($V_1$) and min($V_3$)..max($V_3$) are disjoint).

min_n(M:5..6, r:2, {$V_1$:0..1, $V_2$:1..2, $V_3$:3..4, $V_4$:5..9, $V_5$:0..9}):

Rule 5: Since $V_{<} = \{V_1,V_2,V_3\}$ and min_nval($V_{<}$) = 2 = r, and because the two intervals
min($V_1$)..max($V_1$) and min($V_3$)..max($V_3$) are disjoint, we can remove all values, strictly
less than min(M) = 5, that do not belong to
min($V_1$)..max($V_1$) ∪ min($V_3$)..max($V_3$) = {0,1} ∪ {3,4}; therefore we remove value 2 from $V_2$
and $V_5$. In addition, since the two intervals min($V_2$)..max($V_2$) and min($V_3$)..max($V_3$) are
disjoint, we can also remove value 0 from $V_1$ and $V_5$.
7 The Number of Distinct Values Constraint

The number of distinct values constraint has the form nvalue(D, $\{V_1,..,V_n\}$) where D is
a domain variable and $\{V_1,..,V_n\}$ is a collection of variables. The constraint holds if D
is the number of distinct values taken by the variables $V_1,..,V_n$. This constraint was
introduced in [6] and in [1], but a propagation algorithm for this constraint was not
given. Note that the nvalue constraint can be broken up into two parts:

- at least min(D) distinct values must be taken,

- at most max(D) distinct values may be taken.

While the first part was already studied in [8, page 195], nothing was done for the
second part. The nvalue constraint generalizes several simpler constraints like
the alldifferent and the notallequal constraints. The purpose of this section is to
show how to reduce the minimum and maximum values of D and how to shrink the
domains of $V_1,..,V_n$:

- since the minimum value of D is the minimum number of distinct values that will
be taken by variables $V_1,..,V_n$, one can sort variables $V_1,..,V_n$ on increasing
minimum value and use algorithm 1 in order to get a lower bound of the minimum
number of distinct values. Then the minimum of D will be adjusted to the
previously computed value.

- since the maximum value of D is the maximum number of distinct values that can
be taken by variables $V_1,..,V_n$, one can use a maximum matching algorithm on the
following bipartite graph: the two classes of vertices of the graph are the variables
$V_1,..,V_n$ and the values that can be taken by these variables. There is an edge
between a variable $V_i$ ($1 \leq i \leq n$) and a value val if $V_i$ can take value val. The
maximum value of D will be adjusted to the size of the maximum matching of this
bipartite graph.

- the following rules, respectively similar to rules 2 and 3 of Sect. 6, are used in
order to prune the domains of variables $V_1,..,V_n$:

IF max_matching($V_1,..,V_n$, MAXINT) = min(D) THEN
matching_prune($V_1,..,V_n$, MAXINT),

IF min_nval($V_1,..,V_n$) = max(D) THEN min_nval_prune($V_1,..,V_n$, MAXINT).

The first rule enforces at least min(D) distinct values, while the second
rule propagates in order to have at most max(D) distinct values.
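
The semantics of these bounds and of the pruning can be checked on small instances by exhaustive enumeration. The sketch below is only a brute-force reference for testing, not a propagation algorithm; it assumes integer interval domains given as (lo, hi) pairs, and the function names are hypothetical.

```python
from itertools import product

def nvalue_bounds(domains):
    """Smallest and largest number of distinct values over all assignments."""
    ranges = [range(lo, hi + 1) for lo, hi in domains]
    counts = {len(set(assignment)) for assignment in product(*ranges)}
    return min(counts), max(counts)

def supported_values(domains, d_min, d_max):
    """For each variable, the values that occur in at least one assignment
    whose number of distinct values lies in d_min..d_max."""
    ranges = [range(lo, hi + 1) for lo, hi in domains]
    keep = [set() for _ in domains]
    for assignment in product(*ranges):
        if d_min <= len(set(assignment)) <= d_max:
            for i, value in enumerate(assignment):
                keep[i].add(value)
    return keep

doms = [(0, 1), (1, 2), (3, 4)]
print(nvalue_bounds(doms))            # bounds on D
print(supported_values(doms, 1, 2))   # values kept when max(D) = 2
```

For the domains 0..1, 1..2, 3..4, at least two and at most three distinct values are possible; when max(D) is 2 the first two variables are forced to their common value 1.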

Finally, we point out that one can generalize the number of distinct values
constraint to the number of distinct equivalence classes constraint family by
counting the number of distinct equivalence classes taken by the values of variables
$V_1,..,V_n$ according to a given equivalence relation.



* The notallequal($\{V_1,..,V_n\}$) constraint holds if the variables $V_1,..,V_n$ are not all equal.
8 Conclusion

We have presented generic propagation rules for the minimum and nvalue constraint
families and two algorithms that respectively compute a lower bound for the
minimum number of distinct values and for the (r+1)th smallest distinct value. These
algorithms produce a tight lower bound when each domain consists of one single
interval of consecutive values. However, there should be room for improving these
algorithms in order to take into account holes in the domains of variables. One should
also provide, for small values of r, an algorithm for computing the rth smallest
distinct value of a set of intervals whose complexity depends on r. We did not
address any incremental concern since it would involve other issues like maintaining
a list of domain variables sorted on their minimum, or like regrouping all propagation
rules together in order to factorize common parts.

Acknowledgements. Thanks to Mats Carlsson, Per Mildner and Emmanuel Poder for 
useful comments on an earlier draft of this paper. The author would also like to thank 
the anonymous referees for their insightful reviews. 
Abstract. The Stable Marriage problem (SM) is an extensively-studied
combinatorial problem with many practical applications. In this paper
we present two encodings of an instance I of SM as an instance J of
a Constraint Satisfaction Problem. We prove that, in a precise sense,
establishing arc consistency in J is equivalent to the action of the
established Extended Gale/Shapley algorithm for SM on I. As a consequence
of this, the man-optimal and woman-optimal stable matchings can be
derived immediately. Furthermore we show that, in both encodings, all
solutions of I may be enumerated in a failure-free manner. Our results
indicate the applicability of Constraint Programming to the domain of
stable matching problems in general, many of which are NP-hard.



1 Introduction 

An instance of the classical Stable Marriage problem (SM) comprises n men
and n women, and each person has a preference list in which they rank all
members of the opposite sex in strict order. A matching M is a bijection between
the men and women. A man $m_i$ and woman $w_j$ form a blocking pair for M if
$m_i$ prefers $w_j$ to his partner in M and $w_j$ prefers $m_i$ to her partner in M.
A matching that admits no blocking pair is said to be stable, otherwise the
matching is unstable. SM arises in important practical applications, such as the
annual match of graduating medical students to their first hospital appointments
in a number of countries (see e.g. m- 

Every instance of SM admits at least one stable matching, which can be
found in time linear in the size of the problem instance, i.e. $O(n^2)$, using the
Gale/Shapley (GS) algorithm [4]. An extended version of the GS algorithm -
the Extended Gale/Shapley (EGS) algorithm Section 1.2.4] - avoids some un- 
necessary steps by deleting from the preference lists certain (man, woman) pairs 
that cannot belong to a stable matching. The man-oriented version of the EGS 

* This work was supported by EPSRC research grant GR/M90641. 

T. Walsh (Ed.): CP 2001, LNCS 2239, pp. 225-239, 2001.

(c) Springer-Verlag Berlin Heidelberg 2001
    Men's lists          Women's lists
    1: 1 3 6 2 4 5       1: 1 5 6 3 2 4
    2: 4 6 1 2 5 3       2: 2 4 6 1 3 5
    3: 1 4 5 3 6 2       3: 4 3 6 2 5 1
    4: 6 5 3 4 2 1       4: 1 3 5 4 2 6
    5: 2 3 1 4 5 6       5: 3 2 6 1 4 5
    6: 3 1 2 6 5 4       6: 5 1 3 6 4 2

                  (a)

    Men's lists          Women's lists
    1: 1                 1: 1
    2: 2                 2: 2
    3: 4                 3: 4 6
    4: 6 5 3             4: 3
    5: 5 6               5: 6 4 5
    6: 3 6 5             6: 5 6 4

                  (b)

Fig. 1. (a) An SM instance with 6 men and 6 women; (b) the corresponding GS-lists.



algorithm involves a sequence of proposals from the men to women, provisional 
engagements between men and women, and deletions from the preference lists. 
At termination, the reduced preference lists are referred to as the MGS-lists. 
A similar proposal sequence from the women to the men (the woman-oriented 
version) produces the WGS-lists, and the intersection of the MGS-lists with the 
WGS-lists yields the GS-lists |S1 p.l6]. An important property of the GS-lists 
m Theorem 1.2.5] is that, if each man is given his first-choice partner (or equiv- 
alently, each woman is given her last-choice partner) in the GS-lists then we 
obtain a stable matching called the man-optimal stable matching. In the man- 
optimal (or equivalently, woman-pessimal) stable matching, each man has the 
best partner (according to his ranking) that he could obtain, whilst each woman 
has the worst partner that she need accept, in any stable matching. An analogous 
procedure, switching the roles of the men and women, gives the woman-optimal 
(or equivalently, man-pessimal) stable matching. 

An example SM instance I is given in Figure 1 together with the GS-lists
for I. (Throughout this paper, a person’s preference list is ordered with his/her 
most-preferred partner leftmost.) There are three stable matchings for this in- 
stance: {(1,1), (2,2), (3,4), (4,6), (5,5), (6,3)} (the man-optimal stable matching); 
{(1,1), (2,2), (3,4), (4,3), (5,6), (6,5)} (the woman-optimal stable matching); and 
{(1,1), (2,2), (3,4), (4,5), (5,6), (6,3)}. 
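
The stability of these three matchings can be checked mechanically from the definition of a blocking pair. The following Python sketch assumes complete preference lists (the SM case) stored as dictionaries; the names blocking_pairs, MEN and WOMEN are introduced here for illustration only.

```python
# Preference lists of Figure 1(a), most-preferred partner first.
MEN = {1: [1, 3, 6, 2, 4, 5], 2: [4, 6, 1, 2, 5, 3], 3: [1, 4, 5, 3, 6, 2],
       4: [6, 5, 3, 4, 2, 1], 5: [2, 3, 1, 4, 5, 6], 6: [3, 1, 2, 6, 5, 4]}
WOMEN = {1: [1, 5, 6, 3, 2, 4], 2: [2, 4, 6, 1, 3, 5], 3: [4, 3, 6, 2, 5, 1],
         4: [1, 3, 5, 4, 2, 6], 5: [3, 2, 6, 1, 4, 5], 6: [5, 1, 3, 6, 4, 2]}

def blocking_pairs(matching, men_prefs, women_prefs):
    """Return all (man, woman) pairs that block the given complete matching,
    where the matching is a collection of (man, woman) pairs."""
    wife = dict(matching)
    husband = {w: m for m, w in matching}
    pairs = []
    for m, prefs in men_prefs.items():
        for w in prefs:
            if w == wife[m]:
                break                       # later women are less preferred
            ranking = women_prefs[w]
            if ranking.index(m) < ranking.index(husband[w]):
                pairs.append((m, w))        # both prefer each other
    return pairs

man_optimal = [(1, 1), (2, 2), (3, 4), (4, 6), (5, 5), (6, 3)]
print(blocking_pairs(man_optimal, MEN, WOMEN))
```

A matching is stable exactly when the returned list is empty; for instance, pairing every man i with woman i is not stable here, because man 3 and woman 4 prefer each other to their partners.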

SMI is a generalisation of SM in which the preference lists of those involved
can be incomplete. In this case, person p is acceptable to person q if p appears on
the preference list of q, and unacceptable otherwise. A matching M in an instance
I of SMI is a one-one correspondence between a subset of the men and a subset
of the women, such that $(m, w) \in M$ implies that each of m and w is acceptable
to the other. In this setting, a man m and woman w form a blocking pair for M if 
each is either unmatched in M and finds the other acceptable, or prefers the other 
to his/her partner in M. As in SM, a matching is stable if it admits no blocking 
pair. (It follows from this definition that, from the point of view of finding stable 
matchings, we may assume without loss of generality that p is acceptable to q if 
and only if q is acceptable to p.) A stable matching in I need not be a complete 
matching. However, all stable matchings in I involve exactly the same men and 
women [^. It is straightforward to modify the Extended Gale/Shapley algorithm 
to cope with an SMI instance 0 Section 1.4.2]. A pseudocode description of the 
assign each person to be free;
while some man m is free and m has a nonempty list loop
    w := first woman on m's list; {m 'proposes' to w}
    if some man p is engaged to w then
        assign p to be free;
    end if;
    assign m and w to be engaged to each other;
    for each successor p of m on w's list loop
        delete the pair {p, w};
    end loop;
end loop;

Fig. 2. The man-oriented Extended Gale/Shapley algorithm for SMI.

man-oriented EGS algorithm for SMI is given in Figure 2 (the term delete the
pair {p,w} means that p should be deleted from w's list and vice versa). The
woman-oriented algorithm is analogous. Furthermore, the concept of GS-lists
can be extended to SMI, with analogous properties.
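
A direct Python transcription of the pseudocode of Figure 2 is sketched below. It is an illustration under the assumption that preference lists are dictionaries mapping each person to a list of mutually acceptable partners; the function name and return values are choices made here, not part of the paper.

```python
def extended_gale_shapley(men_prefs, women_prefs):
    """Man-oriented EGS. Returns the reduced men's lists, the reduced
    women's lists and the resulting matching (man -> woman)."""
    men = {m: list(prefs) for m, prefs in men_prefs.items()}
    women = {w: list(prefs) for w, prefs in women_prefs.items()}
    fiance = {}                          # woman -> man currently engaged to her
    free = list(men)                     # every man starts free
    while free:
        m = free.pop()
        if not men[m]:
            continue                     # empty list: m remains unmatched
        w = men[m][0]                    # m 'proposes' to w
        if w in fiance:
            free.append(fiance[w])       # her previous partner becomes free
        fiance[w] = m
        position = women[w].index(m)
        for p in women[w][position + 1:]:
            men[p].remove(w)             # delete the pair {p, w}
        del women[w][position + 1:]
    return men, women, {m: w for w, m in fiance.items()}

# Instance of Figure 1(a)
men_lists = {1: [1, 3, 6, 2, 4, 5], 2: [4, 6, 1, 2, 5, 3], 3: [1, 4, 5, 3, 6, 2],
             4: [6, 5, 3, 4, 2, 1], 5: [2, 3, 1, 4, 5, 6], 6: [3, 1, 2, 6, 5, 4]}
women_lists = {1: [1, 5, 6, 3, 2, 4], 2: [2, 4, 6, 1, 3, 5], 3: [4, 3, 6, 2, 5, 1],
               4: [1, 3, 5, 4, 2, 6], 5: [3, 2, 6, 1, 4, 5], 6: [5, 1, 3, 6, 4, 2]}
mgs_men, mgs_women, matching = extended_gale_shapley(men_lists, women_lists)
print(matching)
```

The order in which free men are picked is arbitrary in the pseudocode; the sketch simply takes the last free man. The lists returned are the MGS-lists, so they may be supersets of the GS-lists of Figure 1(b).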

The Stable Marriage problem has its roots as a combinatorial problem, but
has also been the subject of much interest from the Game Theory and Economics
community [13] and the Operations Research community [14]. In this paper we
present two encodings of an instance I of SMI (and so of SM) as an instance J of
a Constraint Satisfaction Problem (CSP). We show that Arc Consistency (AC)
propagation [T] achieves the same results as the EGS algorithm in a certain sense.
For the first encoding, we show that the GS-lists for I correspond to the domains
remaining after establishing AC in J. The second encoding is more compact;
although the arc consistent domains in J are supersets of the GS-lists, we can
again obtain from them the man-optimal and woman-optimal stable matchings
in I. We also show that, for both encodings, we are guaranteed a failure-free
enumeration of all stable matchings in I using AC propagation (combined with
a value-ordering heuristic in the case of the first encoding) in J.

Our results show that constraint propagation within a CSP formulation of SM
captures the structure produced by the EGS algorithm. We have also
demonstrated the applicability of constraint programming to the general domain of
stable matching problems. Many variants of SM are NP-hard mm, and the 
encodings presented here could potentially be extended to these variants, giving 
a way of dealing with their complexity through existing CSP search algorithms.

The remainder of this paper is organised as follows. In Section 2 we present
the first encoding, then prove the consequent relationship between AC
propagation and the GS-lists in Section 3; the failure-free enumeration result for this
encoding is presented in Section 4. A second encoding, using Boolean variables,
is given in Section 5, and in Section 6 we show the relationship between AC
propagation in this encoding and the man-optimal and woman-optimal stable
matchings, together with the failure-free enumeration result. Section 7 contains
some concluding remarks.
2 A First Encoding for SM and SMI

In this section we present an encoding of the Stable Marriage problem, and 
indeed more generally SMI, as a binary constraint satisfaction problem. 

Suppose that we are given an SMI instance I involving men $m_1, m_2, \ldots, m_n$
and women $w_1, w_2, \ldots, w_n$ (it is not difficult to extend our encoding to the case
that the numbers of men and women are not equal, but for simplicity we assume
that they are equal). For any person q in I, PL(q) (respectively GS(q)) denotes
the set of persons contained in the original preference list (GS-list) of q in I.
For the purposes of exposition, we introduce a dummy man $m_{n+1}$ and a dummy
woman $w_{n+1}$ into the SMI instance, such that, for each i, $m_i$ (respectively $w_i$)
prefers all women (men) on his (her) preference list (if any) to $w_{n+1}$ ($m_{n+1}$).

To define an encoding of I as a CSP instance J, we introduce variables
$x_1, x_2, \ldots, x_n$ corresponding to the men, and $y_1, y_2, \ldots, y_n$ corresponding to the
women. For each i ($1 \leq i \leq n$), we let $dom(x_i)$ denote the values in variable $x_i$'s
domain. Initially, $dom(x_i)$ is defined as follows:

$dom(x_i) = \{j : w_j \in PL(m_i)\} \cup \{n + 1\}$.

For each j ($1 \leq j \leq n$), $dom(y_j)$ is defined similarly. For each i ($1 \leq i \leq n$),
let $d_i^m = |dom(x_i)|$ and let $d_i^w = |dom(y_i)|$. Intuitively, for $1 \leq i,j \leq n$, the
assignment $x_i = j$ corresponds to the case that man $m_i$ marries woman $w_j$, and
the constraints of our encoding will ensure that $x_i = j$ if and only if $y_j = i$.
Similarly, for $1 \leq i \leq n$, the assignment $x_i = n + 1$ (respectively $y_i = n + 1$)
corresponds to the case that $m_i$ ($w_i$) is unmatched. It should be pointed out
that, if the given SMI instance is an SM instance (i.e. every preference list is
complete), then no variable will be assigned the value n + 1 in its domain in any
stable matching.

We now define the constraints between the variables to ensure that the
solutions to the CSP correspond exactly to the stable marriages in I. Given any i
and j ($1 \leq i,j \leq n$), the stable marriage constraint $x_i/y_j$ involving $x_i$ and $y_j$ is
a set of nogoods which we represent by a $d_i^m \times d_j^w$ conflict matrix C. To make the
structure of the conflict matrix clear, we describe it using four possible values
for the element $C_{k,l}$ of C, for any k, l ($k \in dom(x_i)$, $l \in dom(y_j)$), as follows. In
a conventional conflict matrix, the values I and B are disallowed and so would be 0,
while the values A and S are allowed and so would be 1.

A: $C_{k,l}$ = A when k = j and l = i, which Allows $x_i = j$ (and $y_j = i$). At most
one element in C can ever contain the value A.

I: $C_{k,l}$ = I when either k = j and $l \neq i$ or l = i and $k \neq j$, i.e. the two pairings
are Illegal, since either $x_i = j$ and $y_j = l \neq i$ or $y_j = i$ and $x_i = k \neq j$.

B: $C_{k,l}$ = B when $m_i$ prefers $w_j$ to $w_k$ and $w_j$ prefers $m_i$ to $m_l$. Any matching
corresponding to the assignment $x_i = k$ and $y_j = l$ would admit a Blocking
pair involving $m_i$ and $w_j$.

S: $C_{k,l}$ = S for all other entries that are not A, I or B. The simultaneous
assignments of $x_i = k$ and $y_j = l$ are Supported.
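
The four cases can be turned into a small Python sketch that builds the conflict matrix of one constraint $x_i/y_j$. It is an illustration only: the function name conflict_matrix and its argument layout are assumptions made here, rows follow $m_i$'s preference list and columns follow $w_j$'s, and the dummy value n+1 is appended last in both.

```python
def conflict_matrix(i, j, pl_m, pl_w, n):
    """Conflict matrix of x_i/y_j as a list of rows of 'A', 'I', 'B', 'S'.

    pl_m -- preference list of man i (women, most preferred first)
    pl_w -- preference list of woman j (men, most preferred first)
    """
    rows = list(pl_m) + [n + 1]          # dom(x_i), dummy value last
    cols = list(pl_w) + [n + 1]          # dom(y_j), dummy value last
    if j not in pl_m or i not in pl_w:
        # Unacceptable pair: no A entry, the matrix is filled with S's.
        return [["S"] * len(cols) for _ in rows]
    rank_m = {w: pos for pos, w in enumerate(rows)}
    rank_w = {m: pos for pos, m in enumerate(cols)}
    matrix = []
    for k in rows:
        row = []
        for l in cols:
            if k == j and l == i:
                row.append("A")          # x_i = j and y_j = i
            elif k == j or l == i:
                row.append("I")          # only one side of the pairing
            elif rank_m[j] < rank_m[k] and rank_w[i] < rank_w[l]:
                row.append("B")          # m_i and w_j would block
            else:
                row.append("S")
        matrix.append(row)
    return matrix

# Constraint x_6/y_3 for the instance of Figure 1
for row in conflict_matrix(6, 3, [3, 1, 2, 6, 5, 4], [4, 3, 6, 2, 5, 1], 6):
    print("".join(row))
```

For $x_6/y_3$ the A entry lies in the top row, and every column to its right contains only I and B entries, so the corresponding values of $y_3$ have no support.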
The size of each conflict matrix is $O(n^2)$ and clearly there are $O(n^2)$ conflict
matrices; consequently the overall size of the encoding is $O(n^4)$.
Fig. 3. Conflict matrices for stable marriage constraints from the problem in Figure 1



Examples of different types of conflict matrices for stable marriage constraints
$x_i/y_j$ are shown in Figure 3 for the SM instance of Figure 1. In all cases, and
henceforth in this paper, the values in $x_i$'s (respectively $y_j$'s) domain are listed
in order down the rows (along the columns) according to $m_i$'s ($w_j$'s) preference
list, and a blank entry represents an S. Another type of conflict matrix can occur
in an SMI instance: the value A does not occur in a conflict matrix $x_i/y_j$ if $m_i$
and $w_j$ are unacceptable to each other, and the matrix is then filled with S's.

Figure 3(a) shows the conflict matrix for the stable marriage constraint $x_1/y_2$.
The row and column of I's, representing illegal marriages, intersect at the A entry,
and the area to the right of and below A is filled with B's, representing nogood
assignments to $x_1$ and $y_2$ which would lead to $m_1$ and $w_2$ being a blocking pair.

Figure 3(b) shows the conflict matrix for the stable marriage constraint $x_6/y_3$.
Again the area with A at its top left corner is bounded by I's and filled with B's.
However, the A is in the top row, since $w_3$ is at the top of $m_6$'s preference list.
Consequently all values in the domain of $y_3$ to the right of A are unsupported.
Similarly, Figure 3(c) shows the conflict matrix for the stable marriage constraint
$x_3/y_5$, where $m_3$ is at the top of $w_5$'s preference list. The A entry is in the first
column and all values in the domain of $x_3$ below the A are unsupported.

Enforcing AC on the instance of Figure 1 will delete the rows and columns
from Figure 3(b) and (c) corresponding to unsupported values. As shown in the
next section, these deletions are equivalent to those done by the EGS algorithm.

3 Arc Consistency and the GS-Lists 

In this section we prove that, if I is an SMI instance and J is a CSP instance
obtained from I using the encoding of Section 2, AC propagation in J essentially
calculates the GS-lists of I.* The proof depends on two lemmas. The first shows

* Strictly speaking, we prove that, after AC propagation, for any i, j ($1 \leq i,j \leq n$),
$w_j \in GS(m_i)$ iff $j \in dom(x_i)$, and similarly $m_i \in GS(w_j)$ iff $i \in dom(y_j)$.
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that the domains remaining after AC propagation, apart from the dummy val- 
ues, are subsets of the GS-lists. We prove this by showing that, when the EGS 
algorithm removes a value, so does the AC algorithm. The second proves that 
the GS-lists are subsets of the domains remaining after AC propagation. We do 
this by showing that the GS-lists correspond to arc consistent domains for the 
variables in J. 

Lemma 1. For a given variable $x_i$ in J ($1 \le i \le n$), after AC propagation,

$\{w_j : j \in dom(x_i) \setminus \{n + 1\}\} \subseteq GS(m_i)$.

A similar result holds for each variable $y_j$ ($1 \le j \le n$).

Proof. The GS-lists for I are obtained from the original preference lists in I
by deletions carried out by either the man-oriented or woman-oriented EGS
algorithms. We show that the corresponding deletions would occur from the
relevant variables' domains during AC propagation in J. The proof for deletions
resulting from the man-oriented version is presented; the argument for deletions
resulting from the woman-oriented version is similar.

We prove the following fact by induction on the number of proposals $z$ during
an execution E of the man-oriented EGS algorithm (see Figure 2) on I: for any
deletion carried out in the same iteration of the while loop as the $z$th proposal,
the corresponding deletion would be carried out during AC propagation. Clearly
the result is true for $z = 0$. Now assume that $z = r > 0$ and the result is true for
all $z < r$. Suppose that the $r$th proposal during E consists of man $m_i$ proposing
to woman $w_j$. At this point of E, we may use the induction hypothesis to deduce
that, at some point during AC propagation, the conflict matrix for the stable
marriage constraint $x_i/y_j$ has a structure analogous to that of Figure 4(a), since
$w_j$ is at the top of $m_i$'s list. Now suppose that in E, during the same iteration
of the while loop as the $r$th proposal, the pair $\{m_k, w_j\}$ is deleted. Then in J,
all values in $y_j$'s domain to the right of the entry A (including $k$ and $n + 1$)
are unsupported, and will be deleted when the constraint is revised during AC
propagation. Subsequent revision of the constraint $x_k/y_j$ will remove $j$ from
$x_k$'s domain, since $k$ is no longer in $y_j$'s domain and therefore the $j$th row of
the conflict matrix for $x_k/y_j$ contains only I entries. Hence the inductive step is
established.

Consequently, any deletion of a value from a preference list by the man- 
oriented EGS algorithm will be matched by a deletion of a value from the domain 
of the corresponding CSP variable when AC is enforced. The same is true for the 
woman-oriented EGS algorithm. The end result is that the domains remaining 
after AC propagation, omitting the dummy value, are subsets of the GS-lists. □ 



Lemma 2. For each $i$ ($1 \le i \le n$), define a domain of values $dom(x_i)$ for the
variable $x_i$ as follows: if $GS(m_i) \ne \emptyset$, then $dom(x_i) = \{j : w_j \in GS(m_i)\}$;
otherwise $dom(x_i) = \{n + 1\}$. The domain for each $y_j$ ($1 \le j \le n$) is defined
analogously. Then the domains so defined are arc consistent in J.

Fig. 4. Four possible types of stable marriage constraints $x_i/y_j$, shown as panels (a) to (d).

Proof. Suppose that the variables $x_i$ ($1 \le i \le n$) and $y_j$ ($1 \le j \le n$) are assigned
the domains in the statement of the lemma. To show that these domains are arc
consistent, we consider an arbitrary constraint $x_i/y_j$. There are six cases to
consider:

- $w_j$ is at the top of $m_i$'s GS-list. Then $m_i$ is at the bottom of $w_j$'s GS-list.
  Hence the constraint $x_i/y_j$ has a structure similar to that of Figure 4(b).
  Every row or column has at least one A or S and the constraint is arc
  consistent.

- $w_j$ is at the bottom of $m_i$'s GS-list. Then $m_i$ is at the top of $w_j$'s GS-list.
  Hence the constraint $x_i/y_j$ has a structure similar to that of the transpose
  of Figure 4(b) and is arc consistent.

- $w_j$ is in $m_i$'s GS-list, but is not at the top or bottom of that list. Then the
  constraint $x_i/y_j$ has a structure similar to that of Figure 4(c) (i.e. every row
  or column has at least one A or S), and is again arc consistent.

- $w_j \notin GS(m_i)$, but $w_j \in PL(m_i)$ and $GS(m_i) \ne \emptyset$. Then $m_i \notin GS(w_j)$. The
  pair $\{m_i, w_j\}$ was deleted from each other's original lists by either the
  man-oriented EGS algorithm (in which case all successors of $m_i$ on $w_j$'s original
  list were also deleted) or the woman-oriented EGS algorithm (in which case
  all successors of $w_j$ on $m_i$'s original list were also deleted). In either case, the
  constraint $x_i/y_j$ has a structure similar to that of Figure 4(d) and is again
  arc consistent, since all A, B and I entries have been removed, leaving only
  S entries.

- $w_j \notin PL(m_i)$, so $w_j \notin GS(m_i)$, but $GS(m_i) \ne \emptyset$. Then it is straightforward
  to verify that the constraint $x_i/y_j$ has a structure similar to that of Figure
  4(d) and is arc consistent.

- $GS(m_i) = \emptyset$. Then the constraint $x_i/y_j$ is a $1 \times 1$ conflict matrix with a
  single entry S and is arc consistent.

Hence no constraint yields an unsupported value for any variable, and the set of
domains defined in the lemma is arc consistent. □

The following theorem follows immediately from the above lemmas, and the
fact that AC algorithms find the unique maximal set of domains that are arc
consistent.

Theorem 3 Let I be an instance of SMI, and let J be a CSP instance obtained
from I by the encoding of Section 2. Then the domains remaining after AC
propagation in J are identical (in the sense of the footnote to the opening
paragraph of this section) to the GS-lists for I.

Theorem 3 and the discussion of GS-lists in Section 1 show that we can find a
solution to the CSP giving the man-optimal stable matching without search: we
assign each $x_i$ variable the most-preferred value in its domain. Assigning the $y_j$
variables in a similar fashion gives the woman-optimal stable matching. In the
next section, we go further and show that the CSP yields all stable matchings
without having to backtrack due to failure.
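
To make the role of the GS-lists concrete, the following Python sketch computes them with a simple version of the EGS algorithm and then reads off the man-optimal matching as described above. It is a sketch under stated assumptions: the function names, the dictionary representation of preference lists, and the small instance in the usage note are hypothetical, acceptability is assumed to be mutual, and the GS-lists are taken to be what remains after applying the deletions of both the man-oriented and the woman-oriented runs.

```python
def extended_gale_shapley(proposer_pref, receiver_pref):
    """Proposer-oriented EGS; returns the reduced lists of both sides."""
    prop = {p: list(lst) for p, lst in proposer_pref.items()}
    recv = {r: list(lst) for r, lst in receiver_pref.items()}
    fiance = {}                        # receiver -> proposer engaged to them
    free = list(prop)
    while free:
        m = free.pop()
        if not prop[m]:
            continue                   # empty list: m stays unmatched
        w = prop[m][0]                 # m proposes to the head of his list
        if w in fiance:
            free.append(fiance[w])     # w's previous partner becomes free
        fiance[w] = m
        cut = recv[w].index(m) + 1
        for successor in recv[w][cut:]:
            prop[successor].remove(w)  # delete the pair {successor, w}
        del recv[w][cut:]
    return prop, recv


def gs_lists(men_pref, women_pref):
    """Keep only the pairs that survive both EGS runs."""
    men_a, women_a = extended_gale_shapley(men_pref, women_pref)
    women_b, men_b = extended_gale_shapley(women_pref, men_pref)
    men = {m: [w for w in men_a[m] if w in men_b[m]] for m in men_pref}
    women = {w: [m for m in women_a[w] if m in women_b[w]] for w in women_pref}
    return men, women


def man_optimal(gs_men):
    """Each matched man takes the first entry of his GS-list."""
    return {m: lst[0] for m, lst in gs_men.items() if lst}
```

For example, with the hypothetical instance `men = {1: [1, 2], 2: [2, 1]}` and `women = {1: [2, 1], 2: [1, 2]}`, no pair is deleted, and `man_optimal` returns `{1: 1, 2: 2}`; reading the women's lists the same way gives the woman-optimal matching.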

4 Failure-Free Enumeration 

In this section we show that, if I is an SM (or more generally SMI) instance and
J is a CSP instance obtained from I using the encoding of Section 2, then we
may enumerate the solutions of I in a failure-free manner using AC propagation
combined with a suitable value-ordering heuristic in J.

Theorem 4 Let I be an instance of SMI and let J be a CSP instance obtained
from I using the encoding of Section 2. Then the following search process
enumerates all solutions in I without repetition and without ever failing due to an
inconsistency:

- AC is established as a preprocessing step, and after each branching decision
  including the decision to remove a value from a domain;

- if all domains are arc consistent and some variable $x_i$ has two or more values
  in its domain then search proceeds by setting $x_i$ to the most-preferred value
  $j$ in its domain. On backtracking, the value $j$ is removed from $x_i$'s domain;

- when a solution is found, it is reported and backtracking is forced.
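
The search process of Theorem 4 can be sketched as follows. This is a minimal illustration rather than an efficient implementation: AC is enforced by naive repeated revision, the constraint $x_i/y_j$ is represented by a predicate that rejects exactly the I and B entries, and all names (`enumerate_stable_matchings`, `allowed`, `propagate`) are assumptions of this sketch. The function also counts domain wipe-outs, so that the failure-free claim can be observed on small instances.

```python
def enumerate_stable_matchings(men_pref, women_pref):
    """Return (solutions, failures) for the search process of Theorem 4.

    Preference lists map a person to acceptable partners, best first;
    acceptability is assumed to be mutual.
    """
    dummy = len(men_pref) + 1                       # the value n + 1: unmatched
    rank_m = {i: {w: r for r, w in enumerate(p + [dummy])} for i, p in men_pref.items()}
    rank_w = {j: {m: r for r, m in enumerate(p + [dummy])} for j, p in women_pref.items()}
    pairs = [(i, j) for i, p in men_pref.items() for j in p]

    def allowed(i, j, k, l):
        """False exactly when (x_i = k, y_j = l) is an I or B entry."""
        if (k == j) != (l == i):
            return False                            # I: one-sided marriage
        if k == j:
            return True                             # A: m_i marries w_j
        # B: m_i and w_j would each prefer the other to their assignment
        return not (rank_m[i][j] < rank_m[i][k] and rank_w[j][i] < rank_w[j][l])

    def propagate(x, y):
        """Revise constraints to a fixpoint; False signals a wipe-out."""
        changed = True
        while changed:
            changed = False
            for i, j in pairs:
                keep_x = {k for k in x[i] if any(allowed(i, j, k, l) for l in y[j])}
                keep_y = {l for l in y[j] if any(allowed(i, j, k, l) for k in keep_x)}
                if not keep_x or not keep_y:
                    return False
                if keep_x != x[i] or keep_y != y[j]:
                    x[i], y[j], changed = keep_x, keep_y, True
        return True

    def clone(domains):
        return {v: set(d) for v, d in domains.items()}

    solutions, failures = [], 0

    def search(x, y):
        nonlocal failures
        if not propagate(x, y):
            failures += 1
            return
        branching = [i for i in x if len(x[i]) > 1]
        if not branching:
            solutions.append({i: next(iter(x[i])) for i in x})
            return
        i = branching[0]
        j = min(x[i], key=rank_m[i].get)            # most-preferred value
        left = clone(x)
        left[i] = {j}                               # branch: x_i = j
        search(left, clone(y))
        right = clone(x)
        right[i].discard(j)                         # on backtracking: remove j
        search(right, clone(y))

    search({i: set(p) | {dummy} for i, p in men_pref.items()},
           {j: set(p) | {dummy} for j, p in women_pref.items()})
    return solutions, failures
```

Each reported solution maps a man's index to the value of his variable, with the dummy value $n + 1$ standing for an unmatched man. On the hypothetical instance `{1: [1, 2], 2: [2, 1]}` (men) and `{1: [2, 1], 2: [1, 2]}` (women), the sketch reports both stable matchings and the failure counter stays at zero.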



Proof. Let T be the search tree as defined above. We prove by induction on
T that each node of T corresponds to a CSP instance J' with arc consistent
domains; furthermore J' is equivalent to the GS-lists I' for an SMI instance
derived from I, such that any stable matching in I' is also stable in I. Firstly
we show that this is true for the root node of T, and then we assume that this
is true at any branching node u of T and show that it is true for each of the two
children of u.

The root node of T corresponds to the CSP instance J' with arc consistent
domains, where J' is obtained from J by AC propagation. By Theorem 3, J'
corresponds to the GS-lists in I, which we denote by I'. By standard properties
of the GS-lists [?, Theorem 1.2.5], any stable matching in I' is stable in I.

Now suppose that we have reached a branching node u of T. By the induction
hypothesis, u corresponds to a CSP instance J' with arc consistent domains, and
also J' is equivalent to the GS-lists I' for an SMI instance derived from I such
that any stable matching in I' is also stable in I. As u is a branching node of T,
there is some $i$ ($1 \le i \le n$) such that variable $x_i$'s domain has size > 1. Hence in
T, when branching from node u to its two children $v_1$ and $v_2$, two CSP instances
$J'_1$ and $J'_2$ are derived from J' as follows. In $J'_1$, $x_i$ is set to the most-preferred
value $j$ in its domain and $y_j$ is set to $i$, and in $J'_2$, value $j$ is removed from $x_i$'s
domain and value $i$ is removed from $y_j$'s domain.

Footnote: Implicitly we assume that variable $x_i$ inherits the corresponding preferences over the
values in its domain from the preference list of man $m_i$.

We firstly consider instance $J'_1$. During arc consistency propagation in $J'_1$,
revision of the constraint $x_k/y_j$, for any $k$ such that $w_j$ prefers $m_k$ to $m_i$, forces
$l$ to be removed from the domain of $x_k$, for any $l$ such that $m_k$ prefers $w_j$ to $w_l$
(and similarly $k$ is removed from the domain of $y_l$). Hence after such revisions,
$J'_1$ corresponds to the SMI instance $I'_1$ obtained from I' by deleting pairs of the
form $\{m_i, w_l\}$ (where $l \ne j$), $\{m_k, w_j\}$ (where $k \ne i$) and $\{m_k, w_l\}$ (where $w_j$
prefers $m_k$ to $m_i$ and $m_k$ prefers $w_j$ to $w_l$). It is straightforward to verify that
any stable matching in $I'_1$ is also stable in I', which is in turn stable in I by
the induction hypothesis. At node $v_1$, AC is established in $J'_1$, giving the CSP
instance $J''_1$ which we associate with this node. By Theorem 3, $J''_1$ corresponds
to the GS-lists $I''_1$ of the SMI instance $I'_1$. By standard properties of the GS-lists
[?, Section 1.2.5], any stable matching in $I''_1$ is also stable in $I'_1$, which is in turn
stable in I by the preceding argument.

We now consider instance $J'_2$, which corresponds to the SMI instance $I'_2$
obtained from I' by deleting the pair $\{m_i, w_j\}$. It is straightforward to verify
that any stable matching in $I'_2$ is also stable in I', which is in turn stable in I by
the induction hypothesis. At node $v_2$, AC is established in $J'_2$, giving the CSP
instance $J''_2$ which we associate with this node. The remainder of the argument
for this case is identical to the corresponding part in the previous paragraph.

Hence the induction step holds, so that the result is true for all nodes of T. 
Therefore the branching process never fails due to an inconsistency, and it is 
straightforward to verify that no part of the search space is omitted, so that the 
search process lists all stable matchings in the SMI instance I. Finally we note 
that different complete solutions correspond to different stable matchings, so no 
stable matching is repeated. □ 

5 A Boolean Encoding of SM and SMI 

In this section we give a less obvious but more compact encoding of an SMI
instance as a CSP instance. As in Section 2, suppose that I is an SMI instance
involving men $m_1, m_2, \ldots, m_n$ and women $w_1, w_2, \ldots, w_n$. For each $i$ ($1 \le i \le n$)
let $l_i^m$ denote the length of man $m_i$'s preference list, and define $l_j^w$ similarly.

To define an encoding of I as a CSP instance J, we introduce $O(n^2)$ Boolean
variables and $O(n^2)$ constraints. For each $i, j$ ($1 \le i, j \le n$), the variables are
labelled $x_{i,p}$ for $1 \le p \le l_i^m + 1$ and $y_{j,q}$ for $1 \le q \le l_j^w + 1$, and take only two
values, namely T and F. The interpretation of these variables is:

- $x_{i,p} = T$ iff man $m_i$ is matched to his $p$th or worse choice woman or is
  unmatched, for $1 \le p \le l_i^m$;

- $x_{i,p} = T$ iff man $m_i$ is unmatched, for $p = l_i^m + 1$;

- $y_{j,q} = T$ iff woman $w_j$ is matched to her $q$th or worse choice man or is
  unmatched, for $1 \le q \le l_j^w$;

- $y_{j,q} = T$ iff woman $w_j$ is unmatched, for $q = l_j^w + 1$.

Table 1. The constraints in a Boolean encoding of an SMI instance.

1. $x_{i,1} = T$                                              ($1 \le i \le n$)
2. $y_{j,1} = T$                                              ($1 \le j \le n$)
3. $x_{i,p} = F \rightarrow x_{i,p+1} = F$                    ($1 \le i \le n$, $2 \le p \le l_i^m$)
4. $y_{j,q} = F \rightarrow y_{j,q+1} = F$                    ($1 \le j \le n$, $2 \le q \le l_j^w$)
5. $x_{i,p} = T \;\&\; y_{j,q} = F \rightarrow x_{i,p+1} = T$  ($1 \le i, j \le n$) (*)
6. $y_{j,q} = T \;\&\; x_{i,p} = F \rightarrow y_{j,q+1} = T$  ($1 \le i, j \le n$) (*)
7. $x_{i,p} = T \rightarrow y_{j,q+1} = F$                    ($1 \le i, j \le n$) (*)
8. $y_{j,q} = T \rightarrow x_{i,p+1} = F$                    ($1 \le i, j \le n$) (*)

The constraints are listed in Table 1. For each $i$ and $j$ ($1 \le i, j \le n$), the
constraints marked (*) are present if and only if $m_i$ finds $w_j$ acceptable; in this
case $p$ is the rank of $w_j$ in $m_i$'s list and $q$ is the rank of $m_i$ in $w_j$'s list.

Constraints 1 and 2 are trivial, since each man and woman is either matched
with some partner or is unmatched. Constraints 3 and 4 enforce monotonicity:
if a man gets his $(p-1)$th or better choice, he certainly gets his $p$th or better
choice. For Constraints 5-8, let $i$ and $j$ be arbitrary ($1 \le i, j \le n$), and suppose
that $m_i$ finds $w_j$ acceptable, where $p$ is the rank of $w_j$ in $m_i$'s list and $q$ is the
rank of $m_i$ in $w_j$'s list. Constraints 5 and 6 are monogamy constraints; consider
Constraint 5 (the explanation of Constraint 6 is similar). If $m_i$ has a partner no
better than $w_j$ or is unmatched, and $w_j$ has a partner she prefers to $m_i$, then
$m_i$ cannot be matched to $w_j$, so $m_i$ has his $(p+1)$th-choice or worse partner, or
is unmatched. Constraints 7 and 8 are stability constraints; consider Constraint
7 (the explanation of Constraint 8 is similar). If $m_i$ has a partner no better
than $w_j$ or is unmatched, then $w_j$ must have a partner no worse than $m_i$, for
otherwise $m_i$ and $w_j$ would form a blocking pair.
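
As a sanity check on the constraints just explained, the following Python sketch generates Constraints 1 to 8 for a small instance and enumerates all satisfying assignments by brute force. It is illustrative only and rests on assumptions: Constraint 6 is taken to be the mirror image of Constraint 5 (with $x_{i,p} = F$ in its premise), the names `boolean_encoding` and `solutions` and the tuple representation of variables are hypothetical, and exhaustive enumeration is used purely because it is easy to verify, not because it is how the encoding would be solved.

```python
from itertools import product


def boolean_encoding(men_pref, women_pref):
    """Return (variables, constraints) for the Boolean encoding.

    A variable is (kind, person, position). A constraint is a pair
    (premises, conclusion) of (variable, value) literals.
    """
    variables, constraints = [], []
    for kind, prefs in (("x", men_pref), ("y", women_pref)):
        for person, lst in prefs.items():
            length = len(lst)
            variables += [(kind, person, t) for t in range(1, length + 2)]
            constraints.append(([], ((kind, person, 1), True)))        # 1, 2
            for t in range(2, length + 1):                             # 3, 4
                constraints.append(([((kind, person, t), False)],
                                    ((kind, person, t + 1), False)))
    for i, lst in men_pref.items():
        for p, j in enumerate(lst, start=1):
            q = women_pref[j].index(i) + 1      # rank of m_i in w_j's list
            x, x_next = ("x", i, p), ("x", i, p + 1)
            y, y_next = ("y", j, q), ("y", j, q + 1)
            constraints.append(([(x, True), (y, False)], (x_next, True)))  # 5
            constraints.append(([(y, True), (x, False)], (y_next, True)))  # 6
            constraints.append(([(x, True)], (y_next, False)))             # 7
            constraints.append(([(y, True)], (x_next, False)))             # 8
    return variables, constraints


def solutions(men_pref, women_pref):
    """Brute-force the encoding and decode each solution to a matching."""
    variables, constraints = boolean_encoding(men_pref, women_pref)
    found = []
    for values in product([True, False], repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(a[cv] == cval or any(a[v] != val for v, val in prem)
               for prem, (cv, cval) in constraints):
            match = {}
            for i, lst in men_pref.items():
                # m_i's partner sits at the last position p with x_{i,p} = T
                p = max(t for t in range(1, len(lst) + 2) if a[("x", i, t)])
                match[i] = lst[p - 1] if p <= len(lst) else None
            found.append(match)
    return found
```

On the hypothetical instance with men `{1: [1, 2], 2: [2, 1]}` and women `{1: [2, 1], 2: [1, 2]}`, the satisfying assignments decode to exactly the two stable matchings; `None` marks an unmatched man in instances with incomplete lists.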

The next section focuses on AC propagation in J. 

6 Arc Consistency in the Boolean Encoding 

In this section we consider the effect of AC propagation on a CSP instance J
obtained from an SMI instance I by the encoding of Section 5. We show that,
using AC propagation in J, we may recover the man-optimal and woman-optimal
stable matchings in I, and moreover, we may enumerate all stable matchings in
I in a failure-free manner.

Imposing AC in J corresponds (in a looser sense than with the first encoding)
to the application of the EGS algorithm in I from both the men's and women's
sides. Indeed, we can understand the variables in terms of proposals in the EGS
algorithm. That is, $x_{i,p}$ being true corresponds to $m_i$'s $(p-1)$th choice woman
rejecting him after a proposal from a man she likes more. Consequently, the
maximum value of $p$ for which $x_{i,p}$ is true gives the best choice that will accept
$m_i$, and the lowest value of $p$ such that $x_{i,p+1}$ is false gives the worst choice
that he need accept (and the same holds for the $y_{j,q}$ variables). In general, we
will prove that, for a given person $p$ in I, AC propagation in J yields a reduced
preference list for $p$ which we call the Extended GS-list or XGS-list: this contains
all elements in $p$'s preference list between the first and last entries of his/her
GS-list (inclusive). For example, Figure 5(a) repeats the GS-lists from Figure 1 and
(b) shows the XGS-lists after AC is enforced. Note that in general, the XGS-lists
may include some values not in the GS-lists.

(a)  Men's lists    Women's lists      (b)  Men's lists     Women's lists
     1: 1           1: 1                    1: 1            1: 1
     2: 2           2: 2                    2: 2            2: 2
     3: 4           3: 4 6                  3: 4            3: 4 3 6
     4: 6 5 3       4: 3                    4: 6 5 3        4: 3
     5: 5 6         5: 6 4 5                5: 5 6          5: 6 1 4 5
     6: 3 6 5       6: 5 6 4                6: 3 1 2 6 5    6: 5 1 3 6 4

Fig. 5. (a) The GS-lists for the SM instance of Figure 1, and (b) the possible partners
remaining after AC is applied in the Boolean encoding.

We now describe how we can use AC propagation in order to derive the
XGS-lists for I. After we apply AC in J, the monotonicity constraints force
the domains for the $x_{i,p}$ variables to follow a simple sequence, for $p = 1$ to
$l_i^m + 1$. First, there is a sequence of domains {T}, then a sequence of domains
which remain {T,F}, and a final sequence of domains {F}. The first sequence
must be non-empty because $x_{i,1} = T$. If the middle sequence is empty then all
variables associated with $m_i$ are determined, while if the last sequence is empty
it might still happen that $m_i$ fails to find any partner at all. More formally,
let $\pi$ ($1 \le \pi \le l_i^m + 1$) be the largest integer such that $dom(x_{i,\pi}) = \{T\}$, and
let $\pi'$ be the largest integer such that $T \in dom(x_{i,\pi'})$. We will prove that, if
$\pi = l_i^m + 1$ then the XGS-list of $m_i$ is empty; otherwise the XGS-list of $m_i$
contains all people on $m_i$'s original preference list between positions $\pi$ and $\pi'$
(inclusive). Hence, in the latter case, a man $m_i$'s XGS-list consists of the women
at position $p$ in his original list, for each $p$ such that $dom(x_{i,p}) = \{T,F\}$ after AC
propagation, together with the woman in position $\pi$ in his original list. A similar
correspondence exists between the women's XGS-lists and the $y_{j,q}$ variables.
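
The two descriptions of an XGS-list given above (as a slice of the preference list bounded by the GS-list, and as positions read off the domains after AC) can be sketched side by side. The helper names `xgs_list` and `xgs_from_domains` are hypothetical, and the full preference list used in the usage note is an assumption chosen to be consistent with man 6's lists in Figure 5; only its first five entries are implied by the figure.

```python
def xgs_list(pref, gs):
    """Entries of pref between the first and last entries of gs, inclusive."""
    if not gs:
        return []
    start, end = pref.index(gs[0]), pref.index(gs[-1])
    return pref[start:end + 1]


def xgs_from_domains(pref, domains):
    """Read the XGS-list off the AC domains of x_{i,1}, ..., x_{i,l+1}.

    domains[t - 1] is the domain of x_{i,t}, a set over {True, False}.
    """
    length = len(pref)
    positions = range(1, length + 2)
    pi = max(t for t in positions if domains[t - 1] == {True})
    pi_prime = max(t for t in positions if True in domains[t - 1])
    if pi == length + 1:
        return []                       # x_{i,l+1} = T: the man is unmatched
    return pref[pi - 1:pi_prime]        # positions pi .. pi' (1-based)
```

With the assumed list `pref = [3, 1, 2, 6, 5, 4]` and GS-list `[3, 6, 5]`, `xgs_list` returns `[3, 1, 2, 6, 5]`, matching man 6 in Figure 5(b). The same list results from `xgs_from_domains` when the domains are `{T}` at position 1, `{T, F}` at positions 2 to 5, and `{F}` at positions 6 and 7.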

As in Section 3, the proof of this result uses two lemmas. The first shows that
the domains remaining after AC propagation correspond to subsets of the
XGS-lists, whilst the second shows that the XGS-lists correspond to arc consistent
domains.

Lemma 5. For a given $i$ ($1 \le i \le n$), after AC propagation in J, let $p$ be the
largest integer such that $dom(x_{i,p}) = \{T\}$ and let $p'$ be the largest integer such
that $T \in dom(x_{i,p'})$. If $p < l_i^m + 1$ then all entries of $m_i$'s preference list between
positions $p$ and $p'$ belong to the XGS-list of $m_i$. A similar correspondence holds
for the women's lists.

Proof. The first entry on a man m’s XGS-list corresponds to the last woman 
(if any) to whom m proposed during an execution of the man-oriented EGS- 
algorithm. Similarly the last entry on a woman w’s XGS-list corresponds to the 
last man (if any) who proposed to w during an execution of the man-oriented 
EGS-algorithm. A similar correspondence in terms of the woman-oriented EGS- 
algorithm yields the first entry on a woman’s XGS-list and the last entry on 
a man’s XGS-list. We prove that, if a person q is missing from a person p’s 
XGS-list, then AC propagation reduces the domains of the variables relating
to person p correspondingly. (We consider only the correspondences involving 
the man-oriented EGS-algorithm; the gender-reversed argument involving the 
woman-oriented EGS-algorithm yields the remaining cases.) 

It suffices to prove the following result by induction on the number of proposals
$z$ during an execution E of the man-oriented EGS algorithm (see Figure
2) on I: if proposal $z$ consists of man $m_i$ proposing to woman $w_j$, then $x_{i,t} = T$
for $1 \le t \le p$ and $y_{j,t} = F$ for $q < t \le l_j^w + 1$, where $p$ denotes the rank of $w_j$ in
$m_i$'s list and $q$ denotes the rank of $m_i$ in $w_j$'s list.

Clearly the result is true for $z = 0$. Now assume that $z = a > 0$ and the result
is true for all $z < a$. Suppose that the $a$th proposal during E consists of man $m_i$
proposing to woman $w_j$. Suppose that $p$ is the rank of $w_j$ in $m_i$'s list and $q$ is the
rank of $m_i$ in $w_j$'s list. Suppose firstly that $p = 1$. Then $x_{i,p} = T$ by Constraint
1, and $y_{j,t} = F$ for $q < t \le l_j^w + 1$ by Constraints 7 and 4, since $x_{i,p}$'s value
has been determined. Now suppose that $p > 1$. Then $m_i$ previously proposed
to $w_k$, his $(p-1)$th-choice woman (since $m_i$ proposes in his preference list order,
starting with his most-preferred woman). By the induction hypothesis, $x_{i,t} = T$
for $1 \le t \le p - 1$. Woman $w_k$ rejected $m_i$ because she received a proposal from
some man $m_l$ whom she prefers to $m_i$. Let $r, s$ be the ranks of $m_l, m_i$ in $w_k$'s list
respectively, so that $r < s$. By the induction hypothesis, $y_{k,t} = F$ for $t \ge r + 1$.
Thus in particular, $y_{k,s} = F$, so that by Constraint 5, $x_{i,p} = T$, since the values
of $x_{i,p-1}$ and $y_{k,s}$ have been determined. Thus by Constraints 7 and 4, $y_{j,t} = F$
for $q < t \le l_j^w + 1$, since $x_{i,p}$'s value has been determined. This completes the
induction step.

Thus the proof of the lemma is established, so that the domains remaining 
after AC is enforced correspond to subsets of the XGS-lists. □



Lemma 6. For each $i$ ($1 \le i \le n$), define a domain of values $dom(x_{i,t})$ for
the variables $x_{i,t}$ ($1 \le t \le l_i^m + 1$) as follows: if the XGS-list of $m_i$ is empty,
$dom(x_{i,t}) = \{T\}$ for $1 \le t \le l_i^m + 1$. Otherwise, let $p$ and $p'$ be the ranks (in
$m_i$'s preference list) of the first and last women on $m_i$'s XGS-list respectively.
Then $dom(x_{i,t}) = \{T\}$ for $1 \le t \le p$, $dom(x_{i,t}) = \{F\}$ for $p' + 1 \le t \le l_i^m + 1$
and $dom(x_{i,t}) = \{T,F\}$ for $p < t \le p'$. The domains for each variable $y_{j,t}$
($1 \le j \le n$, $1 \le t \le l_j^w + 1$) are defined analogously. Then the domains so
defined are arc consistent in J.

Proof. The proof of this lemma is along similar lines to that of Lemma 2 and
involves showing that Constraints 1 to 8 in Table 1 are arc consistent under the
assignments defined above; we omit the details for space reasons. □

The following theorem follows immediately from the above lemmas, and the 
fact that AC algorithms find the unique maximal set of arc consistent domains. 

Theorem 7 Let I be an instance of SMI, and let J be a CSP instance obtained
from I by the encoding of Section 5. Then the domains remaining after AC
propagation in J are identical (in the sense described before Lemma 5) to the
XGS-lists for I.

Hence Theorem 7 shows that we may find solutions to the CSP giving the
man-optimal and woman-optimal stable matchings in I without search.

We remark in passing that the SAT-based technique of unit propagation is 
strong enough for the same results to hold. This makes no theoretical differ- 
ence to the cost of establishing AC, although in practice we would expect unit 
propagation to be cheaper. This observation implies that a SAT solver applying 
unit propagation exhaustively, e.g. a Davis-Putnam program [2], will perform
essentially the same work as an AC-based algorithm. 

As before, we show that solutions can be enumerated without failure. The 
results are better than before in two ways: first, maintenance of AC is much less 
expensive, and second, there is no need for a specific variable or value ordering. 

Theorem 8 Let I be an instance of SMI and let J be a CSP instance obtained
from I using the encoding of Section 5. Then the following search process
enumerates all solutions in I without repetition and without ever failing due to an
inconsistency:

— AC is established as a preprocessing step, and after each branching decision 
including the decision to remove a value from a domain; 

— if all domains are arc consistent and some variable v has two values in its 
domain, then search proceeds by setting v to T , and on backtracking, to F ; 

— when a solution is found, it is reported and backtracking is forced. 

Proof. This result can be proved by an inductive argument similar to that used
in the proof of Theorem 4. The full details are omitted here for space reasons,
but we indicate below the important points that are specific to this context.

An SMI instance is guaranteed to have a stable matching, though not necessarily
a complete one [?, Section 1.4.2], so the initial establishing of AC in J
cannot result in failure.

Branching decisions are only made when AC has been established, so Theorem 7
applies at branching points. If all domains are of size 1, we report the
solution and terminate. Otherwise, we choose any variable with domain of size 2
and create two branches with the variable set to T and F respectively. If the vari- 
able represents a man, setting it to T excludes the man-optimal matching, but 
the man-pessimal matching remains possible so this branch still contains a solu- 
tion. Conversely, setting the variable to F excludes the man-pessimal matching 
but leaves the man-optimal matching, so this branch also contains a solution. 

The process of establishing AC never removes values which participate in
any solution. As the branching process omits no part of the search space, the 
search process lists all solutions to the SMI instance. Finally we note that dif- 
ferent complete solutions correspond to different stable matchings, so no stable 
matching is repeated. □ 

We conclude this section with a remark about the time complexities of AC
propagation in both encodings. In general, AC can be established in $O(ed^r)$ time
[1], where there are $e$ constraints, each of arity $r$, and domain size is $d$. In the
encoding of Section 5, $e = O(n^2)$, $d = 2$ and $r \le 3$. Thus AC can be established
in $O(n^2)$ time, which is linear in the size of the input. Hence this encoding
of SM achieves the solution in $O(n^2)$ time, which is known to be optimal [?].
We find it remarkable that such a strong result can be obtained without any
special-purpose consistency algorithms. Furthermore, this result contrasts with
the time complexity of AC propagation in the encoding of Section 2: in this case,
$e = O(n^2)$, $d = O(n)$ and $r = 2$, so that AC can be established in $O(n^4)$ time.
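
As a worked check of the two bounds, the parameters stated above can be substituted into the general $O(ed^r)$ bound; the constant factor $2^3$ in the first line is absorbed by the $O$-notation.

```latex
\begin{align*}
\text{Boolean encoding (Section 5):}\quad
  & e = O(n^2),\ d = 2,\ r \le 3
  \;\Longrightarrow\; O(e\,d^{r}) = O(n^2 \cdot 2^{3}) = O(n^2), \\
\text{binary encoding (Section 2):}\quad
  & e = O(n^2),\ d = O(n),\ r = 2
  \;\Longrightarrow\; O(e\,d^{r}) = O(n^2 \cdot n^{2}) = O(n^4).
\end{align*}
```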

7 Conclusion 

We have presented two ways of encoding the Stable Marriage problem and its 
variant SMI as a CSP. The first is a straightforward representation of the problem 
as a binary CSP. We show that enforcing AC in the CSP gives reduced domains 
which are equivalent to the GS-lists produced by the Extended Gale-Shapley 
algorithm, and from which the man-optimal and woman-optimal matchings can 
be immediately derived. Indeed, we show that all solutions can be found without 
failure, provided that values are assigned in preference-list order.

Enforcing AC using an algorithm such as AC-3 would be much more
time-consuming than the EGS algorithm because of the number and size of the
constraints. A constraint propagation algorithm tailored to the stable marriage
constraint would do much better, but to get equivalent performance to EGS we
should effectively have to embed EGS into our constraint solver.

Nevertheless, the fact that we can solve the CSP without search after AC
has been achieved shows that this class of CSP is tractable. Previous tractability
results have identified classes of constraint graph (e.g. [?]) or classes of constraint
(e.g. [?]) which guarantee tractability. In the binary CSP encoding of SM, it is the
combination of the structure of the constraints (a bipartite graph) and their type
(the stable-marriage constraint) that ensures that we find solutions efficiently.

The second encoding we present is somewhat more contrived, but allows AC
to be established, using a general algorithm, with time complexity equivalent to
that of the EGS algorithm. Although the arc consistent domains do not exactly
correspond to the GS-lists, we can again find man-optimal and woman-optimal
matchings immediately, and all stable matchings without encountering failure
during the search. Hence, this encoding yields a CSP-based method for solving
SM and SMI which is equivalent in efficiency to EGS.

The practical application of this work is to those variants of SM and SMI
which are NP-hard [?], or indeed to any situation in which additional
constraints on the problem make the EGS algorithm inapplicable. If we can
extend one of the encodings presented here to these variants, we then have tools 
to solve them, since we have ready-made search algorithms available for CSPs. 

This paper provides a partial answer to a more general question: if we have 
a problem which can be expressed as a CSP, but for which a special-purpose 
algorithm is available, is it ever sensible to formulate the problem as a CSP? 
SM shows that it can be: provided that the encoding is carefully done, existing 
algorithms for simplifying and solving CSPs may give equivalent performance to 
the special-purpose algorithm, with the benefit of easy extension to variants of 
the original problem where the special-purpose algorithm might be inapplicable. 

Abstract. Constraint programming systems provide software architectures
for the fruitful interaction of algorithms for constraint propagation,
branching and exploration of search trees. Search requires the ability 
to restore the state of a constraint store. Today’s systems use different 
state restoration policies. Upward restoration undoes changes using a 
trail, and downward restoration (recomputation) reinstalls information 
along a downward path in the search tree. In this paper, we present an 
architecture that isolates the state restoration policy as an orthogonal 
software component. Applications of the architecture include two novel 
state restoration policies, called lazy copying and batch recomputation, 
and a detailed comparison of these and existing restoration policies with 
“everything else being equal”. The architecture allows the user to opti- 
mize the time and space consumption of applications by choosing existing 
and designing new state restoration policies in response to application- 
specific characteristics. 



1 Introduction 

Finite domain constraint programming (CP(FD)) systems are software systems 
designed for solving combinatorial search problems using tree search. The history 
of constraint programming systems shows an increasing emphasis on software 
design, reflecting user requirements for flexibility in performance debugging and 
application-specific customization of the algorithms involved. 

A search tree is generated by branching algorithms, which at each node provide
different choices that add new constraints to strengthen the store in the
child nodes. Propagation algorithms strengthen the store according to the operational
semantics of constraints in the store, and exploration algorithms decide
on the order in which search trees are explored.

Logic programming proved to be successful in providing elegant means of
defining branching algorithms, reusing the built-in notion of choice points.
Constraint programming systems like SICStus Prolog [Int00] and GNU Prolog
[DC00] provide libraries for propagation algorithms and allow the programming
of exploration algorithms on top of the built-in depth-first search (DFS)
by using meta programming. To achieve a more modular architecture, recent
systems moved away from the logic programming paradigm. The ILOG Solver
library for constraint programming [ILO00] allows the user to implement propagation
algorithms in C++. The user can implement exploration algorithms using
objects that encapsulate the state of search. The language Claire [CJL99] allows
for programming exploration algorithms using built-in primitives for state
manipulation, and the language Oz provides a built-in data structure called
space [Sch97b, Sch00] for implementing exploration algorithms.

At every node in the search tree, the state of variables and constraints is the
result of constraint propagation of the constraints that were added along the
path from the root to the node. During search, the nodes are visited in the order
given by the exploration algorithm. In this paper, we address the question of
how the state corresponding to a node is obtained or restored. Different systems
currently provide different ways of restoring the state corresponding to the target
node. All systems/languages except Oz are based on a state restoration policy
(SRP) that records changes on the state in a data structure called trail. The
trail is employed to restore the state back to an ancestor of the current node.
Schulte presents several alternative SRPs based on copying and
recomputation of states and evaluates their competitiveness conceptually and
experimentally in [Sch99].
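
The difference between the two families of policies can be illustrated with a toy store. The sketch below is not the Figaro API or that of any system cited here: the class `TrailedStore`, its methods, and the function `recompute` are hypothetical names, and the "store" is reduced to a dictionary of finite domains. Trailing records each removal so that it can be undone on the way back up the tree, whereas recomputation rebuilds a node's state by replaying the removals along the path from the root.

```python
class TrailedStore:
    """Toy constraint store: variable domains plus a trail of undo records."""

    def __init__(self, domains):
        self.domains = {v: set(d) for v, d in domains.items()}
        self.trail = []                       # (variable, removed value) pairs

    def remove(self, var, value):
        if value in self.domains[var]:
            self.domains[var].discard(value)
            self.trail.append((var, value))   # record the change for undoing

    def mark(self):
        return len(self.trail)                # a choice point is a trail height

    def undo_to(self, mark):
        """Upward restoration: pop trail entries back to the choice point."""
        while len(self.trail) > mark:
            var, value = self.trail.pop()
            self.domains[var].add(value)


def recompute(root_domains, path):
    """Downward restoration: replay the removals on a root-to-node path."""
    store = TrailedStore(root_domains)
    for var, value in path:
        store.remove(var, value)
    return store.domains
```

In this simplified picture, trailing pays a small cost on every change, while recomputation pays nothing during propagation but repeats work whenever a state is needed again.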

The best state restoration policy for a given application depends on the
amount of propagation (state change), the exploration and the branching. The
goal of this work is to identify software techniques that enable the employment
of different SRPs in the same system without compromising the orthogonal
development of other components such as propagation, branching and exploration.
The architecture allows the user to optimize time and space consumption of
applications by choosing existing or designing new SRPs in response to
application-specific characteristics. We introduce two novel SRPs, namely lazy
copying and batch recomputation, and show experimentally that for many
applications, they improve the time and/or space efficiency over existing SRPs.
State restoration is an important aspect of tree search that deserves the
attention of users and constraint programming systems designers.

We outline in Section 2 a software architecture for constraint programming
systems that will form the base for further discussion. The components are
designed and implemented in C++ using the Figaro library for constraint
programming [HMN99, CHN00, Ng01]. The Figaro library is available at [Fig01]. In
Section 3 we describe the two SRPs currently in use, namely trailing and
recomputation. At the end of Section 3 we give an overview of the rest of the
paper.



2 A Component Design for Search 

In CP(FD), the constraint store represents a computational state, hosting
finite-domain (FD) variables and constraints. A variable has a domain, which is
the set of possible values it can take. A constraint maintains a relation among
a set of variables by eliminating values that conflict with the constraint from
the variable domains, according to its propagation algorithm. Each time a
change is made to a constraint store, a propagation engine performs constraint
propagation until it reaches a fix point, at which no constraint can eliminate
any more values. In our framework, we represent a constraint store by a data
structure called store [Ng01].

Fig. 1. Depth-First Tree Search

Usually, constraint propagation alone is insufficient to solve a problem.
Therefore, we need tree search to find a solution. A search explores the tree in
a top-down fashion. Nodes and branches build up the search tree. It is adequate
to view search in terms of these components: branching, node and exploration.
Figure 1 provides an illustration of tree search. Circles represent nodes, while
lines connecting two nodes represent branches. The numbers inside the nodes
give the order of exploration. The dashed arrows indicate DFS. For simplicity,
we only consider binary search trees.

The branching describes the shape of the search tree. Common branching
algorithms include a simple labeling procedure (naive enumeration of variables),
variable ordering (such as first-fail), and domain splitting. For solving
scheduling problems, more complex branching algorithms, such as resource
serialization, are used. In our setting, branching coincides with the notion of
a choice point. The class Branching shown in Program 1 has a method choose
(line 5; for conciseness, we refer to C++ member functions as methods), which
adds a constraint to the store based on the choice given and returns the
branching (choice point) of the child node. Branching also defines methods to
check whether it is done (line 3) or has failed (line 4).



Program 1 Declaration of Branching

1 class Branching {
2 public:
3   bool done() const;
4   bool fail() const;
5   Branching* choose(store* s, int i) const;
6 };

Program 2 Declaration of Node

1 class Node {
2 protected:
3   store* cs;
4   Branching* branch;
5   Node *parent, *left_child, *right_child;
6 public:
7   Node(store* s, Branching* b);
8   bool isLeaf() const;
9   bool isFail() const;
10  Node* make_left_child();
11  Node* make_right_child();
12 };



A node represents a state in the search tree. The class Node shown in
Program 2 contains a store, a branching, and pointers to parent and children
nodes (lines 3-5). The constructor (line 7) takes a store and a branching as
arguments. The left and right children nodes are created by calling the methods
make_left_child and make_right_child respectively (lines 10-11). Each time a
child node is created, the branching adds a constraint to the store. To proceed to
the next level of the search tree, constraint propagation must reach a fix point.
Node also has methods to check if the node is a leaf node (line 8) or a failure
node (line 9).

Figure 2 gives a graphical representation of nodes and branchings. The left
side shows the design of nodes. A tree is linked bi-directionally, where the parent
points to the children and vice versa. The right side shows the relation between
nodes and branchings during the creation of children nodes. Solid arrows
represent pointers, while labelled, dashed arrows represent the respective method
calls. Calling the make_left_child or the make_right_child method creates
a child node, which, in turn, invokes the method choose of the current node's
branching, which returns a branching for the child node.

Fig. 2. Tree Node and Relation with Branching

The exploration specifies the tree traversal order. DFS is the most common
exploration algorithm used in tree search for constraint programming. Program 3
shows the implementation of DFS. Function DFS takes a node as an argument
and tries to find the first solution using a depth-first strategy. It returns the
node if it is a leaf containing a solution (line 2) or NULL if the node has
failed (line 3). Otherwise, it recursively searches the left (lines 4-5) and
right (line 6) subtrees.

Program 3 Exploration: Depth First Search

1 Node* DFS(Node* node) {
2   if (node->isLeaf()) return node;               // solution found
3   if (node->isFail()) return NULL;               // dead end
4   Node* result = DFS(node->make_left_child());
5   if (result != NULL) return result;
6   return DFS(node->make_right_child());
7 }
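To make the interplay of store, branching and exploration concrete, the following is a minimal, self-contained sketch. It is not Figaro code: the toy `Store` (finite domains under a single all-different constraint), the free function `choose` (naive labeling) and the function `dfs` are assumptions made for illustration, and each child simply works on its own copy of the store.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy store (illustrative assumption): sorted finite domains plus one
// all-different constraint over all variables.
struct Store {
    std::vector<std::vector<int>> dom;
    bool failed = false;

    // Propagate to a fix point: remove every assigned value from all other
    // domains until nothing changes or some domain becomes empty.
    void propagate() {
        bool changed = true;
        while (changed && !failed) {
            changed = false;
            for (std::size_t i = 0; i < dom.size() && !failed; ++i) {
                if (dom[i].size() != 1) continue;
                int v = dom[i][0];
                for (std::size_t j = 0; j < dom.size(); ++j) {
                    if (j == i) continue;
                    auto it = std::find(dom[j].begin(), dom[j].end(), v);
                    if (it == dom[j].end()) continue;
                    dom[j].erase(it);
                    changed = true;
                    if (dom[j].empty()) { failed = true; break; }
                }
            }
        }
    }

    bool solved() const {
        if (failed) return false;
        for (const auto& d : dom) if (d.size() != 1) return false;
        return true;
    }
};

// Naive labeling: alternative 0 assigns the smallest value of the first
// undetermined variable, alternative 1 removes that value instead.
void choose(Store& s, int alt) {
    for (auto& d : s.dom) {
        if (d.size() <= 1) continue;
        if (alt == 0) d.resize(1); else d.erase(d.begin());
        break;
    }
    s.propagate();
}

// Depth-first search in the spirit of Program 3. Each child gets its own
// copy of the store, so nothing has to be restored on backtracking.
bool dfs(const Store& s, Store& solution) {
    if (s.failed) return false;
    if (s.solved()) { solution = s; return true; }
    for (int alt = 0; alt < 2; ++alt) {
        Store child = s;          // copy the state of the current node
        choose(child, alt);       // add the branching constraint, propagate
        if (dfs(child, solution)) return true;
    }
    return false;
}
```

Calling `dfs` on three variables with domain {1, 2, 3} yields the first solution in left-to-right order; with domain {1, 2} for all three variables the search fails. The sections that follow are about replacing the line `Store child = s;` by cheaper ways of obtaining the child state.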

3 Restoration Policies 

The problem of state restoration occurs in systems where a state results from a
sequence of complex operations, and where the states corresponding to different
(sub)sequences are requested over time. For example, in distributed systems,
state restoration is used to recover from failure in a network node [NX95].

In constraint-based tree search, the dominant SRP has been trailing. This
policy requires recording the changes made to the state in a data structure
called the trail. To go from a node to its parent, the recorded changes are
undone. The reason for this dominance lies in the historical fact that
constraint programming evolved from logic programming, and that trailing is
employed in all logic programming systems for state restoration. The
combination of the general idea of trailing with modifications specific to
constraint programming [AB90] was deemed sufficient for constraint programming.
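The basic mechanism of trailing can be sketched in a few lines. The class `TrailedInts` and its method names are hypothetical and chosen for illustration only; real systems trail memory locations of arbitrary type and apply refinements such as time stamping [AB90], which are omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A state made of integer cells, with a trail of (location, old value) pairs.
class TrailedInts {
public:
    explicit TrailedInts(std::size_t n) : cells_(n, 0) {}

    // Every write must go through set() so that the old value is recorded.
    // This is what makes state-changing operations "search aware".
    void set(std::size_t i, int v) {
        trail_.emplace_back(i, cells_[i]);
        cells_[i] = v;
    }

    int get(std::size_t i) const { return cells_[i]; }

    // A mark identifies the trail position at a choice point.
    std::size_t mark() const { return trail_.size(); }

    // Undo all changes recorded after the mark, newest first.
    void undo_to(std::size_t m) {
        while (trail_.size() > m) {
            cells_[trail_.back().first] = trail_.back().second;
            trail_.pop_back();
        }
    }

private:
    std::vector<int> cells_;
    std::vector<std::pair<std::size_t, int>> trail_;
};
```

A search engine would take a mark when it creates a choice point and call `undo_to` with that mark to return from a descendant to the node of the choice point.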

Schulte [Sch00] shows that other SRPs have appealing advantages. Starting
from the idea of copying an entire constraint store, he introduced several SRPs
that trade space for time by recomputing the store from a copy made in an
ancestor node instead of making a copy at every node [Sch99]. These SRPs
have the advantage of not requiring the recording of changes in propagation
algorithms, thereby considerably simplifying the design of CP(FD) systems.

In the design presented in Section 2, the SRP is determined by the definition
of the methods make_left_child and make_right_child in the class Node. These
methods need to create a new node together with its store and branching from
the information present in the current node. This indicates that we may be able
to arrive at different SRPs by providing different implementations of the Node
class, without affecting other components such as branching and exploration.
The next section shows that it is indeed possible.

By isolating the SRP in a separate component that is orthogonal to the
other components, the development of new SRPs may be simplified, which may
inspire the development of new SRPs. Indeed, we will present two new SRPs in
Sections 5 and 6. Trailing requires all operations to be search aware, and is
not orthogonal to the rest of the system [Sch99]. Section 7 presents a variant
called coarse-grained trailing, which can be implemented as an orthogonal
component. By having existing and new SRPs available in one system, we are able
to conduct an experimental evaluation of them with "everything else being
equal"; we report the results of this evaluation in Section 8.

4 Restoration Components 

The previous section showed that the Node class is the component that decides 
the SRP. The aim, therefore, is to design different types of nodes for different 
SRPs, namely, CopyingNode for copying and RecomputationNode for recompu- 
tation. All these nodes inherit from the base class Node. Hence, we specify the 
restoration component of search by passing the correct node type as an argu- 
ment. 

The idea for CopyingNode and RecomputationNode is presented in [Sch97a]
and it allows the Oz Explorer to have copying and recomputation as SRP for
DFS exploration. We separate the SRP aspect of nodes from the exploration
aspect by implementing SRP-specific extensions of the Node base class.

The Node base class is similar to the one introduced in Program 2, except
that it does not contain a store anymore (remove line 3). Rather, the decision
on whether to keep a store and on the type of store to keep is implemented in
the subclasses.

The copying SRP requires each node of the search tree to keep a copy of the 
store. Hence, the class CopyingNode contains an additional attribute to keep the 
copy. As the store provides a method clone for creating a copy of itself, when a 
CopyingNode explores and creates a child node, it keeps a copy of the store and 
passes the other copy to the child node. 

The recomputation SRP keeps stores for only some nodes, and recomputes
the stores of other nodes from their ancestors. A maximum recomputation
distance (MRD) of n means that a copy of a store is kept at
every n-th level of the tree. Figure 3 shows the difference between copying and
recomputation with MRD of 2. Copies of the stores are kept only in shaded
nodes. Copying can be viewed as recomputation with MRD of 1.

Fig. 3. Copying vs. Recomputation

For RecomputationNode, we introduce four attributes: (1) a pointer to the store;
(2) an integer counter d to check whether we have reached the n-th level of the
tree; (3) an integer choice, which indicates whether the node is the first or the
second child of its parent; and (4) a boolean flag copy to indicate the presence
of a copy of the store. If d reaches the n-th level limit when creating a child
node, a copy of the store is kept and copy is set to true. During the exploration
of a node where recomputation of the store is needed (i.e., no copy of the store
is kept), the method recompute shown in Program 4 recursively recomputes the
store from the ancestors, by committing each parent's store to the alternative
given by choice (line 7).

Program 4 Recomputing Stores in Search Tree

1 Store* RecomputationNode::recompute(int i) {
2   Store* rs;
3   if (copy)
4     rs = cs->clone();                  // start from the copy kept here
5   else
6     rs = parent->recompute(choice);    // rebuild this node's store first
7   branch->choose(rs, i);               // commit to alternative i
8   return rs;
9 }
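The following self-contained sketch illustrates the recomputation scheme just described. It is a simplification under stated assumptions, not the Figaro implementation: the toy `Store` holds a single interval variable, the branching is a fixed domain split folded into `Store::commit`, stores are passed by value, and the names `make_child`, `store` and `has_copy` are hypothetical.

```cpp
#include <cassert>
#include <memory>

// Toy store: one variable with an interval domain [lo, hi].
struct Store {
    int lo, hi;
    // Domain splitting: alternative 0 keeps the lower half, 1 the upper half.
    void commit(int alt) {
        int mid = lo + (hi - lo) / 2;
        if (alt == 0) hi = mid; else lo = mid + 1;
    }
};

class RecomputationNode {
public:
    // The root always holds a copy of the store.
    RecomputationNode(const Store& s, int mrd)
        : parent_(nullptr), choice_(0), d_(0), mrd_(mrd),
          cs_(std::make_unique<Store>(s)) {}

    // Create the child for alternative alt (0 = left, 1 = right). The parent
    // must outlive the child, because the child may recompute through it.
    std::unique_ptr<RecomputationNode> make_child(int alt) {
        std::unique_ptr<RecomputationNode> c(new RecomputationNode(this, alt));
        if (c->d_ == mrd_) {   // n-th level reached: keep a copy in the child
            c->cs_ = std::make_unique<Store>(recompute(alt));
            c->d_ = 0;
        }
        return c;
    }

    // Store of this node: the copy if present, else recomputed from ancestors.
    Store store() const { return cs_ ? *cs_ : parent_->recompute(choice_); }

    bool has_copy() const { return cs_ != nullptr; }

private:
    RecomputationNode(RecomputationNode* p, int alt)
        : parent_(p), choice_(alt), d_(p->d_ + 1), mrd_(p->mrd_) {}

    // Counterpart of Program 4: obtain this node's store, then commit it to
    // alternative alt, which yields the store of the corresponding child.
    Store recompute(int alt) const {
        Store rs = cs_ ? *cs_ : parent_->recompute(choice_);
        rs.commit(alt);
        return rs;
    }

    RecomputationNode* parent_;
    int choice_;                  // alternative of the parent leading here
    int d_;                       // distance to the nearest copy above
    int mrd_;                     // maximum recomputation distance
    std::unique_ptr<Store> cs_;   // non-null iff this node holds a copy
};
```

With an MRD of 2 and a root domain of [0, 15], the children along the path right, left, right hold no copy, a copy, and no copy respectively, and the store of the last node is recomputed as [10, 11]. Setting the MRD to 1 makes every node hold a copy, which corresponds to the copying SRP.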

Adaptive recomputation (AR) [Sch99] improves recomputation performance
by keeping only a copy of the store at a depth equidistant from the depth of
an existing copy (or root, if none exists) and the depth of the last-encountered
failure. It is straightforward to implement AR by introducing another argument
to the method recompute which counts the length of the recomputation path.
The additional copy of the store is made when the counter reaches half the
length.

During exploration, it is often clear that the store of a node is not needed
any longer and can be safely passed to a child. For example, in the case of DFS,
we pass the store to the second child when the first child's subtree is fully
explored. For such cases, nodes provide the methods create_last_right_child and
create_last_left_child. When a copy-holding node N is asked for its last child
node A, the node N will pass its store to the child node A, which then becomes a
copy-holding node. This optimization, described in [Sch00] as Last Alternative
Optimization, saves space and performs the recomputation step N → A only
once.

Best solution search (for solving optimization problems) such as
branch-and-bound requires the dynamic addition of a constraint during search,
which demands the next solution to be better than the currently best solution.
The Node class has a method:

State post_constraint (BinaryFunction* BF, store* s) ;

to add this constraint to the store inside a node. This addition is similar to
the injection of a computation in an Oz space [Sch97b]. The method takes
a binary function to enforce the order, and the best solution store. It returns
FAIL if enforcing the order causes failure. However, care should be taken during
recomputation, where not every node in the tree may contain a copy of the store.
For that, we need to introduce extra attributes to keep the constraints, which
will be added as recomputation is performed.



5 Lazy Copying 



Lazy copying is essentially a copy-on-write technique, which maintains multiple
references to an object. A copy is made only when we write to the object. Figure 4
shows the differences between copying and lazy copying.

Some operating systems use this technique for managing processes sharing
the same virtual memory [MBKQ96]. In ACE [PGH95], a parallel implementation
of Prolog, an incremental copying strategy reduces the amount of information
transferred during its share operation. In or-parallelism, sharing is used
to pass work from one or-agent to another, and is similar to the lazy copying
strategy.

In conventional CP(FD) systems, constraints have direct references (pointers)
to the variables they use and/or vice versa. In such systems, lazy copying requires
that every time an object (say O) is written to become N, every object that is
pointing to O would need to be copied such that each new copy points to N
while the old copies continue to point to O. This process needs to be executed
recursively, until copies have been made for the entire connected sub-graph of the
constraints and the variables. This requirement can be avoided through relative
addressing [Ng01], where every reference to an object is an address (or index),
called ID, into the vector of placeholders. This technique is implemented in
Figaro, where constraint and variable objects are always referenced through the
placeholders.

From a software engineering point of view, the technique allows us to provide 
the same concept for both copying and lazy copying. To support lazy copy- 
ing, we introduce lazy-copying stores that possess the copy-on-write charac- 
teristics for the constraint and variable objects. Conceptually, a lazy copying 
store behaves like a copying store except that its internal implementation delays 
the copying until a write operation on the particular object. The implementa- 
tion of LazyCopyNode is straightforward; we only need to replace the store in 
CopyingNode by a lazy copying store described above. 
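A lazy copying store with relative addressing can be sketched as follows. This is an illustration under assumptions rather than the Figaro store: the names `LazyStore`, `Var`, `read`, `write` and `shares` are hypothetical, and the decision whether an object is shared is delegated to the reference count of `std::shared_ptr`, which is adequate only for single-threaded use.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for a variable object; constraint objects would be handled alike.
struct Var { std::vector<int> domain; };

// Objects are referenced by ID, an index into a vector of placeholders.
// Cloning copies only the placeholders; an object itself is copied the first
// time a store writes to it while another store still shares it.
class LazyStore {
public:
    std::size_t add(Var v) {
        slots_.push_back(std::make_shared<Var>(std::move(v)));
        return slots_.size() - 1;
    }

    LazyStore clone() const { return *this; }   // shares all objects

    const Var& read(std::size_t id) const { return *slots_[id]; }

    // Copy-on-write access: duplicate the object if it is shared.
    Var& write(std::size_t id) {
        if (slots_[id].use_count() > 1)
            slots_[id] = std::make_shared<Var>(*slots_[id]);
        return *slots_[id];
    }

    // True if both stores still refer to the same object under this ID.
    bool shares(const LazyStore& other, std::size_t id) const {
        return slots_[id] == other.slots_[id];
    }

private:
    std::vector<std::shared_ptr<Var>> slots_;
};
```

Because other objects refer to a variable only through its ID, a write replaces a single placeholder in the writing store; no other object needs to be copied, which is the point of relative addressing made above.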



6 Batch Recomputation 

Recomputation performs a sequence of constraint additions and fix point
computations. At earlier fix point computations, the implicit knowledge of later
constraints is not exploited. This means that work is done unnecessarily, since
recomputation will never encounter failure. Thus, recomputation can be improved
by accumulating the constraints to be added along the path and invoking the
propagation engine for computing the fix point only once. Since recomputation
constraints are added all at once, we call this technique batch recomputation.
Batch recomputation is also applicable to adaptive recomputation, which we call
batch adaptive recomputation.

Fig. 4. Comparison between Copying and Lazy Copying

Batch recomputation requires a data structure to record the branching de- 
cision during exploration. The recorded branching decision is useful to add the 
correct constraint in constant time along the recomputation path. Batch re- 
computation also requires a data structure to accumulate the added constraints 
along the recomputation path and the ability to control the propagation for per- 
forming propagation in a single batch. A condition for the correctness of batch 
recomputation is the monotonicity of constraints, meaning that different orders 
of constraint propagation must result in the same fix point. 

The implementation of batch recomputation in our architecture is straight- 
forward. The branching objects provide the facility to record the branching de- 
cisions during exploration. The method choose adds the correct constraint in 
constant time during recomputation. The store uses a propagation queue for 
accumulating the added constraints along the recomputation path and provides 
a feature to disable and invoke propagation explicitly. 
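The difference between plain and batch recomputation can be illustrated with a small sketch. Everything in it is an assumption made for the example: the toy `Store` propagates only the ordering constraint x[0] < x[1] < ... on interval bounds, a recorded branching decision has the form x[var] <= bound, and pending constraints are accumulated directly in the bounds rather than in a propagation queue.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A branching decision recorded during exploration: x[var] <= bound.
struct Decision { std::size_t var; int bound; };

// Toy store over interval variables with the constraint x[0] < x[1] < ...
class Store {
public:
    Store(std::size_t n, int lo, int hi) : lo_(n, lo), hi_(n, hi) {}

    void set_propagation(bool on) { enabled_ = on; }

    // Add a constraint; propagate immediately only if propagation is enabled.
    void tell(const Decision& d) {
        hi_[d.var] = std::min(hi_[d.var], d.bound);
        if (enabled_) propagate();
    }

    // Compute the fix point of the ordering constraint on the bounds.
    void propagate() {
        ++runs_;
        for (bool changed = true; changed;) {
            changed = false;
            for (std::size_t i = 0; i + 1 < lo_.size(); ++i) {
                if (lo_[i + 1] < lo_[i] + 1) { lo_[i + 1] = lo_[i] + 1; changed = true; }
                if (hi_[i] > hi_[i + 1] - 1) { hi_[i] = hi_[i + 1] - 1; changed = true; }
            }
        }
    }

    int lo(std::size_t i) const { return lo_[i]; }
    int hi(std::size_t i) const { return hi_[i]; }
    int runs() const { return runs_; }   // number of fix point computations

private:
    std::vector<int> lo_, hi_;
    bool enabled_ = true;
    int runs_ = 0;
};

// Plain recomputation: one fix point computation per decision on the path.
Store recompute(Store s, const std::vector<Decision>& path) {
    for (const auto& d : path) s.tell(d);
    return s;
}

// Batch recomputation: accumulate all decisions, then propagate once.
Store batch_recompute(Store s, const std::vector<Decision>& path) {
    s.set_propagation(false);
    for (const auto& d : path) s.tell(d);
    s.set_propagation(true);
    s.propagate();
    return s;
}
```

Both functions reach the same bounds because the propagation is monotonic, but `batch_recompute` runs the propagation loop once instead of once per decision. The sketch omits failure handling, which is acceptable here only because a recomputation path has already been explored without failure.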

In Mozart/Oz, the process of branching is achieved by communication be- 
tween choice points and engines, which always run in separate threads. The 
communication insists on performing propagation to the fixpoint (in Oz termi- 
nology: until the space is stable), and thus precludes an implementation of batch 
recomputation in an Oz search engine in the current setup. On the other hand, 
it is conceivable that the branching primitive choose is wrapped in a mechanism 
that records the branching decisions, and that a data structure containing these 
decisions is made available to a batch recomputation engine. An alternative is 
to extend spaces by primitives to enable/disable stability enforcement. 

7 Coarse-Grained Trailing 

Coarse-grained trailing is an approximation of trailing as implemented in most
CP(FD) systems. Instead of trailing updates of memory locations, we trail the
complete variable object or constraint object when changes occur. As mentioned
in Section 5, our architecture provides a relative addressing scheme and allows
us to make copies of variables and constraints, which makes the implementation
simple.

Fig. 5. Coarse-grained Trailing

Coarse-grained trailing only keeps a single store for the entire exploration.
Figure 5 shows its implementation. A half-shaded node represents a trailing
node and arrows represent pointers. A trailing node holds a pointer to a common
shared trail. The shared trail contains a trailing store and a pointer to the current
node where the store is defined. A trailing store is needed because of the strong
dependency between the store and the actual trail.

Program 5 shows the declaration of the trailing node and shared trail. The
class TrailingNode implements the coarse-grained trailing SRP. It contains an
integer mark, which represents the trail marker for terminating backtracking
(line 2). This corresponds to the time stamping technique [AB90]. The integer i
(line 2) indicates whether the node is the first or second child of its parent. The
constructor of the class SharedTrail takes a store and a pointer to the root
node as arguments (line 10). When exploring a node D, which is not pointed to
by the current node, the method jump (line 12) changes the trailing store from
the current node to the node D. First, jump computes the path leading to the
common ancestor with method computePath (line 11), then backtracks to the
common ancestor, and finally descends to node D by recomputation.

The implementations of trailing and lazy copying store are closely related, 
since both create a copy of the changed object before a state modification occurs. 
Compared to trailing, the coarse granularity imposes an overhead, which grows 
with the complexity of the constraints (global constraints). If the constraints 
contain large stateful data structures, trailing may record incremental changes 
as opposed to copying the whole data structure on the trail as it is done by 
coarse-grained trailing. 
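The store side of coarse-grained trailing can be sketched as follows. The sketch rests on assumptions: the names `TrailingStore`, `Var`, `write`, `mark` and `backtrack` are hypothetical, the whole object is saved on every write, and the time stamping refinement that avoids saving the same object twice between two marks is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for a variable object; constraint objects would be handled alike.
struct Var { std::vector<int> domain; };

// Store with relative addressing: objects are referenced by ID. Before an
// object is modified, the complete object is saved on the trail.
class TrailingStore {
public:
    std::size_t add(Var v) {
        objs_.push_back(std::move(v));
        return objs_.size() - 1;
    }

    const Var& read(std::size_t id) const { return objs_[id]; }

    // Coarse-grained trailing: save the whole object, then allow the write.
    Var& write(std::size_t id) {
        trail_.emplace_back(id, objs_[id]);
        return objs_[id];
    }

    // A mark identifies the trail position that belongs to a node.
    std::size_t mark() const { return trail_.size(); }

    // Restore all objects saved after the mark, newest first.
    void backtrack(std::size_t m) {
        while (trail_.size() > m) {
            objs_[trail_.back().first] = std::move(trail_.back().second);
            trail_.pop_back();
        }
    }

private:
    std::vector<Var> objs_;
    std::vector<std::pair<std::size_t, Var>> trail_;
};
```

The cost of each trail entry is proportional to the size of the saved object, which is the overhead for constraints with large state discussed above.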



8 Experiments 

This section compares and analyses the runtime and memory profile of the
different SRPs. The experiments are run on a PC with a 400 MHz Pentium II
processor, 256MB main memory and 512MB swap memory, running Linux (RedHat
6.0 Kernel 2.2.17-14). All experiments are conducted using the current
development version of the Figaro system [HMN99, CHN00, Ng01], a C++ library
for constraint programming. The Figaro library is distributed under the Lesser
GNU Public License [Fig01], and all benchmark programs are included in the
distribution.

The SRPs are denoted by the following symbols: CP - Copying, TR -
Coarse-grained Trailing, LC - Lazy copying, RE - Recomputation, AR - Adaptive
recomputation, BR - Batch Recomputation, BAR - Batch Adaptive Recomputation.
To facilitate the comparison, the maximal recomputation distance MRD for RE,
AR, BR and BAR is computed using the formula
$\mathrm{MRD} = \lceil \mathit{depth} / 5 \rceil$, where depth is the depth of
the search tree. All benchmark timings (Time) are the average
of 5 runs measured in seconds, and have been taken as wall clock time. The
coefficient of variation is less than 5%. Memory requirements are measured in
terms of maximum memory usage (Max) in kilobytes (KB). It refers to the memory
used by the C++ runtime system rather than the actual memory usage because
C++ allocates memory in chunks.
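As a small worked example, and assuming the formula reads MRD = ⌈depth / 5⌉ (the division is our reading of the formula), the recomputation distance can be computed in integer arithmetic. The function name `mrd_for_depth` is hypothetical.

```cpp
#include <cassert>

// MRD = ceil(depth / 5), computed without floating point for depth >= 0.
int mrd_for_depth(int depth) {
    return (depth + 4) / 5;
}
```

For a search tree of depth 50 this gives an MRD of 10, and for depth 72 an MRD of 15.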

The benchmark problems are: the Alpha crypto-arithmetic puzzle,
the Knights tour problem on an 18 x 18 chess board, the Magic Square puzzle
of size 6, a round robin tournament scheduling problem with 7 teams and a
resource constraint that requires fair distribution over courts (Larry), aligning
for a Photo, a Hamiltonian path problem with 20 nodes, the ABZ6 Job shop
scheduling benchmark, the Bridge scheduling benchmark with side constraints,
and the 100-S-Queens puzzle that uses three distinct (with offset) constraints.

Table 1 lists the characteristics of the problems. These benchmarks provide
the evaluation of the different SRPs based on the following criteria: problem
size, amount of propagation, search tree depth, and number of failures. Our
comparison of the different SRPs is based on "everything else being equal",
meaning all other elements such as store, branching, exploration, etc. are kept
unchanged except the SRP.



Program 5 Shared Trail and Trailing Node

0 class TrailingNode : public Node {
1 protected:
2   int i, mark; SharedTrail* trail;
3 public: // method declarations...
4 };
5
6 class SharedTrail {
7 private:
8   TrailingStore* ts; TrailingNode* current;
9 public:
10  SharedTrail(Store* s, TrailingNode* tn);
11  list<TrailingNode*> computePath(TrailingNode* tn);
12  void jump(TrailingNode* tn);
13 };

Since different components of a CP(FD) system are dependent on one another,
the performance may vary. For instance, the choice of FD representation
has a significant effect on the performance. For these experiments, the FD
representation is a list of intervals. Some problems may perform differently when a
bit vector representation is used. Another remark is that the speed of copying 
between our system and Mozart is different for the following reasons: differ- 
ent FD representations, amount of data being copied, variable wake up scheme 
during propagation, and memory management (Mozart uses automatic garbage 
collection). Therefore, the results do not match exactly with Schulte |Sch!TO] . 



Table 1. Characteristics of Example Programs

example       search      choice   fail   soln  depth   var  constr
Alpha         all/naive     7435   7435      1     50    26      21
Knights       one/naive      266     12      1    265  7500   11205
Magic Square  one/split    46879  46829      1     72    37      15
Larry         one/naive      389    371      1     40   678    1183
Photo         best/naive   23911  23906      6     34    95      53
Hamilton      one/naive     7150   7145      1     66   288     195
ABZ6          best/rank     2409   2395     15     91   102     120
Bridge        best/rank     1268   1261      8     78    44      88
100-S-Queen   one/ff         115     22      1     97   100       3



Table 2. Runtime and Memory Performance of Copying

Example            Time (s)  Max (KB)
Alpha                19.200      1956
Knights              22.086    330352
Magic Square        160.360      2632
Larry                 5.844      5712
Photo                35.086      1912
Hamilton             50.514      2176
ABZ6                 25.004      4936
Bridge(10x)           8.582      2888
100-S-Queen(10x)      8.444      7816









Table 2 gives the runtime and memory performance of copying. Figure 6
shows the comparison of coarse-grained trailing and recomputation. The
numbers are obtained by dividing the performance of each SRP by the performance
of copying. A value below 1 means better performance, while a value above 1
means worse performance than copying. This group of comparisons confirms the
following results of Schulte [Sch99]. Copying suffers from the problem of memory
swapping for large problems with deep search trees such as Knights.
Recomputation improves copying by trading space for time. Adaptive recomputation
minimizes the penalty in runtime of recomputation by using more space.

Coarse-grained trailing performs well compared to copying and the
recomputation schemes. The memory peak in Photo is probably due to the STL
dynamic array memory allocation, which grows the array size by
repeated doubling. Coarse-grained trailing provides us with an approximation
for comparing the performance of trailing and recomputation.

Lazy copying aims at combining the advantages of both coarse-grained trailing
and copying. Figure 7 shows its performance against both SRPs; the numbers
are obtained by dividing lazy copying's numbers by copying's and coarse-grained
trailing's numbers. Over the benchmark problems, in the worst case, lazy copying
performs the same as copying, while for the cases with a small amount of
propagation, lazy copying can save memory and even time. Unfortunately, lazy
copying still performs badly for large problems with deep search trees such as
Knights, when compared to coarse-grained trailing. This is due to the extra
accounting data we keep for lazy copying. However, lazy copying improves the
runtime over coarse-grained trailing for problems like Magic Square, Larry and
Bridge, where there are many failure nodes, because lazy copying can jump
directly from one node to another upon backtracking, while coarse-grained
trailing has to carry out the extra operation of undoing the changes.

Batch recomputation aims at improving the runtime performance of
recomputation. The memory requirement is the same as for recomputation. Figure 8
shows the runtime performance of batch recomputation versus recomputation
and batch adaptive recomputation versus adaptive recomputation. Batch
recomputation improves the runtime of recomputation for all cases. However, batch
adaptive recomputation improves only slightly over adaptive recomputation except
for Larry. This is due to the design of adaptive recomputation, which makes a
copy in the middle when a failure is encountered, which, in turn, reduces the
recomputation distance that batch recomputation can take advantage of.

Comparisons with other constraint programming systems are needed in order
to gauge the effect of the component architecture and the overhead for relative
addressing. Initial results are reported in [Ng01].



Fig. 6. Performance of Coarse-grained Trailing and Recomputation vs. Copying
(panels: Time of TR, RE, AR vs. CP; Memory of TR, RE, AR vs. CP)

Fig. 7. Performance of Lazy Copying vs. Copying and Coarse-grained Trailing
(panels: Lazy Copying vs. Copying; Lazy Copying vs. Coarse-grained Trailing)



9 Conclusion 

We developed an architecture that allows us to isolate the state restoration policy
(SRP) from other components of the system. Its main features are:

Relative addressing: Variable and constraint objects are referred to by IDs,
   which are mapped to actual pointers through store-specific vectors.

Branching objects: Search trees are defined by branching objects, which are
   recursive choice points.

Exploration algorithms: Exploration algorithms are defined in terms of a
   small number of operations on nodes.

SRPs are represented by different extensions of the base class Node. Apart from
the existing copying and recomputation SRPs, we introduced the following two
new SRPs.

Lazy copying uses a copy-on-write technique for variables and constraints and
   improves over or is as good as copying on all benchmarks. Lazy copying
   benefits from a relative addressing implementation.



Fig. 8. Time of Batch Recomputation vs. Recomputation
(panel: BR vs. RE and BAR vs. AR)






C.W. Choi, M. Henz, and K.B. Ng 



Batch recomputation modifies recomputation by installing all constraints to
   be added to the ancestor at once and improves over Schulte's recomputation
   for all benchmarks.

The presented architecture allows the user to optimize time and space
consumption of applications by choosing existing or designing new SRPs in
response to application-specific characteristics. The SRP components are
designed and implemented in C++ on the base of the Figaro library for constraint
programming [HMN99, CHN00, Ng01], and evaluated on a set of benchmarks ranging
from puzzles to realistic scheduling and timetabling problems. The library and
benchmarks are distributed at [Fig01]. State restoration is an important aspect
of tree search that deserves the attention of users and constraint programming
systems designers. From the experiments, we conclude that the best SRP is
problem dependent. It would be interesting to study what kind of problem
structure benefits from which SRP, which would help in optimizing the time and
space consumption of tree search.
able feedback on this paper, Ong Kar Loon for continuous discussions and
collaboration on the Figaro library, and Edgar Tan for comments.



References 



[AB90]    Abderrahamane Aggoun and Nicolas Beldiceanu. Time Stamps Techniques
          for the Trailed Data in Constraint Logic Programming Systems. In Actes
          du Séminaire 1990 - Programmation en Logique, pages 487-509, Trégastel,
          France, May 1990. CNET.

[CHN00]   Tee Yong Chew, Martin Henz, and Ka Boon Ng. A toolkit for
          constraint-based inference engines. In Enrico Pontelli and Vitor Santos
          Costa, editors, Practical Aspects of Declarative Languages, Second
          International Workshop, PADL 2000, Lecture Notes in Computer Science
          1753, pages 185-199, Boston, MA, 2000. Springer-Verlag, Berlin.

[CJL99]   Yves Caseau, François-Xavier Josset, and François Laburthe. CLAIRE:
          Combining sets, search and rules to better express algorithms. In
          Danny De Schreye, editor, Proceedings of the International Conference
          on Logic Programming, pages 245-259, Las Cruces, New Mexico, USA, 1999.
          The MIT Press, Cambridge, MA.

[DC00]    Daniel Diaz and Philippe Codognet. The GNU Prolog system and its
          implementation. In ACM Symposium on Applied Computing, Como, Italy,
          2000. Documentation and system available at
          http://www.gnu.org/software/prolog.

[Fig01]   Figaro library for constraint programming. Documentation and system
          available from http://figaro.comp.nus.edu.sg, Department of Computer
          Science, National University of Singapore, 2001.

[HMN99]   Martin Henz, Tobias Müller, and Ka Boon Ng. Figaro: Yet another
          constraint programming library. In Proceedings of the Workshop on
          Parallelism and Implementation Technology for Constraint Logic
          Programming, Las Cruces, New Mexico, USA, 1999. Held in conjunction
          with ICLP'99.



Components for State Restoration in Tree Search 



255 



[ILOOO] 

[IntOO] 

[MBKQ96] 

[NgOl] 

[NX95] 

[PGH95] 

[Sch97a] 

[Sch97b] 

[Sch99] 

[SchOO] 



ILOG Inc., Mountain View, CA 94043, USA, http://www.ilog.com. 
ILOG Solver 5.0, Reference Manual, 2000. 

Intelligent Systems Laboratory. SICStus Prolog User’s Manual. SICS Re- 
search Report, Swedish Institute of Computer Science, 

URL http : //www. sics . se/ isl/sicstus .html, 2000. 

Marshall Kirk McKusick, Keith Bostic, Michael J. Karels, and John S. 
Quarterman. The Design and Implementation of the f.fBSD Operating 
System. Addison- Wesley, Reading, MA, 1996. 

Ka Boon Kevin Ng. A Generic Software Framework For Finite Domain 
Gonstraint Programming. Master’s thesis. School of Computing, National 
University of Singapore, 2001. 

R. H. B. Netzer and J. Xu. Necessary and sufficient conditions for con- 
sistent global snapshots. IEEE Transactions on Parallel and Distributed 
Systems, (6): 165-169, 1995. 

Enrico Pontelli, Gopal Gupta, and Manuel Hermenegildo. &ACE: A high 
performance parallel prolog system. In 9th International Parallel Process- 
ing Symposium, pages 564-571. IEEE Press, 1995. 

Christian Schulte. Oz Explorer: A visual constraint programming tool. 
In Lee Naish, editor. Proceedings of the International Gonference on Log- 
ic Programming, pages 286-300, Leuven, Belgium, July 1997. The MIT 
Press, Cambridge, MA. 

Christian Schulte. Programming constraint inference engines. In Gert 
Smolka, editor, Principles and Practice of Constraint Programming — 
CP97, Proceedings of the Third International Conference, Lecture Notes 
in Gomputer Science 1330, pages 519-533, Schloss Hagenberg, Linz, Aus- 
tria, October/November 1997. Springer- Verlag, Berlin. 

Christian Schulte. Comparing trailing and copying for constraint program- 
ming. In Danny De Schreye, editor. Proceedings of the International Con- 
ference on Logic Programming, pages 275-289, Las Cruces, New Mexico, 
August 1999. The MIT Press, Cambridge, MA. 

Christian Schulte. Programming Constraint Services. Doctoral disserta- 
tion, Universitat des Saarlandes, Naturwissenschaftlich-Technische Fakul- 
tat I, Fachrichtung Informatik, Saarbriicken, Germany, 2000. To appear in 
Lecture Notes in Artificial Intelligence, Springer- Verlag. 



Adaptive Constraint Handling with CHR in Java 



Armin Wolf 
Fraunhofer Gesellschaft 

Institute for Computer Architecture and Software Technology (FIRST) 
Kekuléstraße 7, D-12489 Berlin, Germany
Armin.Wolf@first.fraunhofer.de http://www.first.fraunhofer.de



Abstract. The most advanced implementation of adaptive constraint 
processing with Constraint Handling Rules (CHR) is introduced in the 
imperative object-oriented programming language Java. The presented 
Java implementation consists of a compiler and a run-time system, all 
implemented in Java. The run-time system implements data structures 
like sparse bit vectors, logical variables and terms as well as an adaptive 
unification and an adaptive entailment algorithm. Proven technologies
like attributed variables for constraint storage and retrieval as well
as code generation for each head constraint are used. Also implemented
are theoretically sound algorithms for adapting rule derivations and
constraint stores after arbitrary constraint deletions. The presentation
is rounded off with some novel applications of CHR in constraint pro- 
cessing: simulated annealing for the n queens problem and intelligent 
backtracking for some SAT benchmark problems. 



1 Introduction 

Java is a state-of-the-art, object-oriented programming language that is
well-suited for interactive and/or distributed problem solving [215]. The development
of graphical user interfaces is well supported by the JavaBeans concept and
the graphical components of the Swing package (cf. [3]). There are several
approaches using constraint technologies for (distributed) constraint solving that
are based on Java (e.g. [5iw5]). m in particular is a recent approach that
integrates Constraint Handling Rules into Java. Constraint Handling Rules (CHR)
are multi-headed, guarded rules used to propagate new or simplify given
constraints m. However, this Java implementation of CHR only supports
chronological backtracking for constraint deletions, similar to the implementations of
CHR in ECLiPSe [S] and SICStus Prolog [TT]. Arbitrary additions and deletions
of constraints that may arise in interactive or even distributed problem solving
environments are not directly supported. These restrictions have been removed
by previous, mainly theoretical, work dii. However, an implementation of
a CHR system that allows arbitrary additions and deletions of constraints was
not yet available.

T. Walsh (Ed.): CP 2001, LNCS 2239, pp. 256-270, 2001.
© Springer-Verlag Berlin Heidelberg 2001

This paper presents a first implementation of adaptive constraint handling
with CHR (cf. m). The implementation language is Java. This imperative
programming language was chosen because of its properties (see above) and
because, unlike Prolog, it has no integrated, fixed add/delete mechanism for
constraints. This latest implementation of CHR improves on the previous
implementations in terms of flexibility and/or efficiency. For the user, this CHR
implementation offers well-established features such as

— no restriction of the number of heads in a rule 

— compilation of rules in textual order 

— constant time access to constraints 

— code is compiled not interpreted 

and opens up new application areas for CHR in constraint solving: 

— local search 

— back-jumping and dynamic backtracking 

— adaptive solution of dynamic problems 

There are several CHR examples in this paper. However, one example will
guide us through the section on the system. This example is not a typical
constraint handler, but it is small and still illustrates various considerations and
stages during compilation and use of CHR in Java.

Example 1 (Primes). The sieve of Eratosthenes may be implemented as a kind
of “chemical abstract machine” (cf. [H]): assume that, for an integer n > 2,
the constraints prime(2), ..., prime(n) are generated. The CHR

prime(I) \ prime(J) <=> J mod I == 0 | true.

will filter out all non-prime “candidates”. Once the rule no longer applies, only the
constraints prime(p), where p is a prime number, are left. More specifically, if
there is a constraint prime(i) and some other constraint prime(j) such that
j mod i = 0 holds, then j is a multiple of i, i.e. j is non-prime. Thus, prime(i)
is kept but prime(j) is removed. In addition, the empty body of the rule (true)
is executed.
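
The effect of this rule can be imitated in a few lines of plain Java. The following is only an illustrative sketch and not the CHR system described in this paper: the class PrimeRuleSketch and its methods are hypothetical names, the store is modelled as a set of positive integers (so duplicate constraints are not represented), and the rule is simply retried until no pair of constraints satisfies the guard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Illustrative only: imitates prime(I) \ prime(J) <=> J mod I == 0 | true. */
final class PrimeRuleSketch {

    /** Applies the rule to a store of positive integers until it no longer applies. */
    static TreeSet<Integer> applyUntilFixpoint(Iterable<Integer> candidates) {
        TreeSet<Integer> store = new TreeSet<>();
        for (int c : candidates) store.add(c);
        boolean applied = true;
        while (applied) {
            applied = false;
            search:
            for (int i : store) {                  // kept head: prime(I)
                for (int j : store) {              // removed head: prime(J)
                    if (i != j && j % i == 0) {    // two distinct constraints, guard holds
                        store.remove(j);           // the body is empty (true)
                        applied = true;
                        break search;              // the store changed, so restart the search
                    }
                }
            }
        }
        return store;
    }

    /** Helper: the integers from..to (inclusive). */
    static List<Integer> range(int from, int to) {
        List<Integer> result = new ArrayList<>();
        for (int k = from; k <= to; k++) result.add(k);
        return result;
    }

    public static void main(String[] args) {
        // prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
        System.out.println(applyUntilFixpoint(range(2, 30)));
    }
}
```

The order in which pairs are tried does not change the result here: whenever a constraint prime(i) that could have removed prime(j) is itself removed by some prime(i'), then i' also divides j.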

The paper is organized as follows. First, the syntax and operational semantics 
of CHR are briefly recapitulated. Then, the system’s architecture, interfaces and 
performance are described. Specifically, the primes sieve is used as a benchmark 
to compare the runtime of the system with the recent implementation of CHR 
in SICStus Prolog. Some novel applications of CHR complete the presentation. 
The paper closes with some conclusions and a brief outline of future work. 



2 The Syntax and Operational Semantics of CHR 

Some familiarity with constraint logic programming is assumed (e.g. [IRijl. The 
presented CHR implementation supports a restricted set of built-in constraints, 
which are either syntactic equations or arithmetic relations over a predefined set 
of arithmetic terms (for details, see [TH]). Arbitrary host language statements as
in the SICStus implementation of CHR (see [TT]) are not (yet) supported. One
reason is that for every host language statement in the body of a CHR there
must be an undo-statement, which is executed whenever applications of this rule
are no longer valid.

2.1 Syntax 

There are three kinds of CHR: 



Simplification: $H_1, \ldots, H_i \Leftrightarrow G_1, \ldots, G_j \mid B_1, \ldots, B_k.$

Propagation: $H_1, \ldots, H_i \Rightarrow G_1, \ldots, G_j \mid B_1, \ldots, B_k.$

Simpagation: $H_1, \ldots, H_l \setminus H_{l+1}, \ldots, H_i \Leftrightarrow G_1, \ldots, G_j \mid B_1, \ldots, B_k.$



The head $H_1, \ldots, H_i$ is a non-empty, finite sequence of CHR constraints, which
are logical atoms. The guard $G_1, \ldots, G_j$ is a possibly empty, finite sequence of
built-in constraints, which are either syntactic equations or arithmetic relations.
If the guard is empty, it has the meaning of true. The body $B_1, \ldots, B_k$ is a
possibly empty, finite sequence of built-in or CHR constraints. If the body is
empty, it has the meaning of true.

2.2 Operational Semantics 

The operational semantics of CHR in the actual implementation (for details,
see [18119]) is compatible with the operational semantics given in m. Owing
to lack of space, the formal definitions are not repeated here. Instead, an
informal description of the operational behaviour of CHR is given, adopting the
ideas presented in m: a CHR constraint is implemented as both code (a Java
method) and data (a Java object), an entry in the constraint store. Whenever
a CHR constraint is added (executed) or woken (re-executed), the applicability
of those CHRs that contain the executed constraint in their heads is checked.
Such a constraint is called active; all other constraints in the constraint store are
called passive.

Head. The head constraints of a CHR serve as constraint patterns. If the active 
constraint matches a head constraint of a CHR, passive partner constraints are 
searched that match the other head constraints of this CHR. If matching partners 
are found for all head constraints, the guard is executed. Otherwise, the next 
CHR is tried. 

Guard. After successful head matching, the guard must be entailed by the built-in
constraints. Entailment means that all arithmetic calculations are defined, i.e.
variables are bound to numerical values, arithmetic tests succeed and syntactical
equations are entailed by the current constraint store, e.g. $\exists Y\,(X = f(g(Y)))$
is entailed by the equations $X = f(Z)$ and $Z = g(1)$. If the guard is entailed,
the CHR applies and the body is executed. Otherwise, either other matching
partners are searched or, if no matching partners are found, the next CHR is
tried.

Body. If the firing CHR is a simplification, all matched constraints (including
the active one) are removed from the constraint store and the body constraints
are executed. In the case of a simpagation, only the constraints that match the
head constraints after the ‘\’ are removed. In the case of a propagation, the
body is executed without removing any constraints. It should be noted that a
propagation will not fire again with the same matching constraints (in the same
order). If the active constraint has not been removed, the next CHR is tried.



Suspension and Wakeup. If all CHR have been tried and the active constraint 
has not been removed, it suspends until a variable that occurs in it becomes 
more constrained by built-in constraints, i.e. is bound. Suspension means that 
the constraint is inserted in the constraint store as data. Wakeup means that 
the constraint is re-activated and re-executed as code. 

3 The System 

In the beginning, only the runtime system 
and the compiler are given. CHR handlers 
and applications are the responsibility of the 
user. The runtime system and the compiler 
contain the data structures that are required 
to define rule-based adaptive constraint 
solvers and to implement Java programs that 
apply these solvers to dynamic constraint 
problems. The definition of a rule-based 
constraint solver is quite simple: the CHRs 
that define the solver for a specific domain 
are coded in a so-called CHR handler. A 
CHR handler is a Java program that uses 
the compiler in a specific manner. Compiling 
and running a CHR handler generates a 
Java package containing Java code that implements the defined solver and its 
interface: the addition or deletion of user-defined constraints or syntactical 
equations, a consistency test and the explanation of inconsistencies. This 
problem-specific solver package may be used in any Java application. Figure 1
shows the components and their interactions. 

Fig. 1. The architecture of the adaptive CHR system.

3.1 The Runtime System

The core of the adaptive constraint-handling system is its runtime system.
Among other things, it implements attributed logical variables (the subclass
Variable of the class Logical) as presented in [TD], logical terms (the subclass
Structure of Logical) and data structures for CHR and built-in constraints.
For dynamic constraint processing, constraints are justified by integer sets. These
sets are implemented as sparse bit vectors (the class SparseSet, cf. [TS]). This
implementation is much more storage- and runtime-efficient than the bit sets in
the Java API.¹ Based on these sets and the other data structures, an adaptive
unification algorithm m and an adaptive entailment algorithm [16] are
implemented. The runtime system is the common basis for

— the compiler 

— the CHR handlers 

— the generated handler packages 

— the applications using the handlers 
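
To make the idea of a sparse bit vector for justifications concrete, the following minimal Java sketch stores only the non-zero 64-bit words of a set of non-negative integers. It is an assumption-laden illustration, not the SparseSet class of the runtime system: the class name SparseBitsSketch, its methods and the word-map representation are all hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sparse bit vector: only non-zero 64-bit words are stored. */
final class SparseBitsSketch {
    // word index (element / 64) -> bit pattern of the elements in that word
    private final TreeMap<Integer, Long> words = new TreeMap<>();

    static SparseBitsSketch of(int... elements) {
        SparseBitsSketch s = new SparseBitsSketch();
        for (int e : elements) s.add(e);
        return s;
    }

    void add(int element) {
        if (element < 0) throw new IllegalArgumentException("negative element");
        words.merge(element >>> 6, 1L << (element & 63), (a, b) -> a | b);
    }

    boolean contains(int element) {
        if (element < 0) return false;
        Long w = words.get(element >>> 6);
        return w != null && (w & (1L << (element & 63))) != 0;
    }

    /** Union, e.g. for uniting the justifications of several constraints. */
    SparseBitsSketch union(SparseBitsSketch other) {
        SparseBitsSketch r = new SparseBitsSketch();
        r.words.putAll(words);
        for (Map.Entry<Integer, Long> e : other.words.entrySet())
            r.words.merge(e.getKey(), e.getValue(), (a, b) -> a | b);
        return r;
    }

    /** True if both sets share at least one element. */
    boolean intersects(SparseBitsSketch other) {
        for (Map.Entry<Integer, Long> e : words.entrySet()) {
            Long w = other.words.get(e.getKey());
            if (w != null && (w & e.getValue()) != 0) return true;
        }
        return false;
    }

    boolean isEmpty() { return words.isEmpty(); }

    /** Number of 64-bit words actually stored. */
    int storedWords() { return words.size(); }
}
```

In this sketch a set such as {3, 1000000} occupies two words, whereas a dense bit vector would have to span every word up to the largest element.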



3.2 The Compiler and Its Interface 

The compiler class is also written in Java. Logical term objects that represent 
CHR heads, guards and bodies may be added to a compiler object. Thus, a 
parsing phase that transfers CHR into an internal representation is unnecessary. 
All CHRs are represented in a canonical form, which allows uniform treatment 
of simplifications, propagations and simpagations (cf. [11]). This form consists
of 



— a (remove) array of all head constraints that are removed when the rule is 
applied 

— a (keep) array of all head constraints that are kept when the rule is applied 

— an array of all guard conditions that have to be entailed 

— an array of all body constraints that are added when the rule is applied 

At most one of the two arrays of head constraints may be empty. To define a 
CHR-based constraint solver, the canonical form of the rules has to be added in 
a CHR handler to a compiler object. 
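
The canonical form can be pictured as a small record of four arrays from which the kind of rule follows. The sketch below is a simplified illustration under stated assumptions: constraints are plain strings rather than the Structure objects of the runtime system, and the class CanonicalRuleSketch and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical canonical rule: remove/keep heads, guard and body as string lists. */
final class CanonicalRuleSketch {
    enum Kind { SIMPLIFICATION, PROPAGATION, SIMPAGATION }

    final List<String> remove, keep, guard, body;

    CanonicalRuleSketch(List<String> remove, List<String> keep,
                        List<String> guard, List<String> body) {
        // at most one of the two head arrays may be empty
        if (remove.isEmpty() && keep.isEmpty())
            throw new IllegalArgumentException("a rule needs at least one head constraint");
        this.remove = copy(remove);
        this.keep = copy(keep);
        this.guard = copy(guard);
        this.body = copy(body);
    }

    private static List<String> copy(List<String> l) {
        return Collections.unmodifiableList(new ArrayList<>(l));
    }

    /** The rule kind is determined by which head array is empty. */
    Kind kind() {
        if (keep.isEmpty()) return Kind.SIMPLIFICATION;   // all heads are removed
        if (remove.isEmpty()) return Kind.PROPAGATION;    // all heads are kept
        return Kind.SIMPAGATION;                          // some kept, some removed
    }

    public static void main(String[] args) {
        CanonicalRuleSketch primes = new CanonicalRuleSketch(
                Arrays.asList("prime(J)"), Arrays.asList("prime(I)"),
                Arrays.asList("J mod I == 0"), Collections.<String>emptyList());
        System.out.println(primes.kind()); // prints SIMPAGATION
    }
}
```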

Example 2 (Primes, continued). The canonical representation of the simpagation
prime(I) \ prime(J) <=> J mod I == 0 | true in Java is shown in
the CHR handler for the primes sieve presented in Figure 2. The head variables
are defined in lines 5 and 6. In line 7, the functor of the unary constraint prime is
defined. In lines 8 and 9, the head constraints are constructed. The guard
condition is constructed in lines 10-12, where the built-in modulo operator mod_2 and
the built-in predicate identical_2 (the equivalent of Prolog’s ‘==’) are used. In
lines 14-17, the canonical form of the rule is added to the compiler object.

When all rules have been added, the compilation has to be activated: the
compiler method compileAll() that activates the translation phase is called
(cf. Figure 2, line 18). The generated methods for the active constraints

— match formal parameters to actual arguments of the active (head) constraint

— find and match passive partners for the remaining head constraints

— check the guards

— remove matched constraints from the constraint store if required

— execute the bodies

¹ Experiments have shown that the improvement is at least one order of magnitude
for randomly generated sparse sets.

import common.*;       // import the runtime system
import compile.DJCHR;  // import the compiler class
public class primeHandler {
  public static void main( String[] args ) {
    Variable i = new Variable("I");
    Variable j = new Variable("J");
    Functor prime_1 = new Functor("prime", 1);
    Structure prime_i = new Structure(prime_1, new Logical[]{ i });
    Structure prime_j = new Structure(prime_1, new Logical[]{ j });
    Structure cond = new Structure(DJCHR.identical_2, new Logical[]{
        new Structure(DJCHR.mod_2, new Logical[]{ j, i }),
        new ZZ(0) }); // j mod i == 0
    DJCHR djchr = new DJCHR("prime", new Structure[]{ prime_1 });
    djchr.addRule(new Structure[]{ prime_j },
                  new Structure[]{ prime_i },
                  new Structure[]{ cond },
                  null);
    djchr.compileAll();
  }
}

Fig. 2. The CHR handler for the sieve of Eratosthenes.

public boolean prime_1_0_0(Constraint pc0, Logical[] args, SparseSet label) {
  pc0.lock();
  SolutionTriple etriple = new SolutionTriple(); etriple.addToLabel(label);
  Logical tmplogical; SparseSet tmplabel;
  primeVariable local0 = primeVariable.newLocal("J");
  local0.lbind(args[0], label);
  boolean applied = false;
  search: do {
    primeVariableTable.Stepper st1
      = primeVarTab.initIteration(new primeVariable[] { }, 0);
    while (st1.hasNext()) {
      Constraint pc1 = st1.next();
      if (!pc1.isUsable()) continue;
      SparseSet plab1 = (SparseSet)pc1.getLabel();
      SolutionTriple.Point point1 = etriple.setPoint();
      etriple.addToLabel(plab1);
      primeVariable local1 = primeVariable.newLocal("I");
      local1.lbind(pc1.getArgs()[0], plab1);
      do {
        SparseSet guardLabel0 = new SparseSet();
        Logical logical0 = local0.deref(guardLabel0);
        Logical logical1 = local1.deref(guardLabel0);
        if (!(logical0 instanceof ZZ && logical1 instanceof ZZ &&
              ((((ZZ)logical0).val % ((ZZ)logical1).val) == 0)))
          continue;
        etriple.addToLabel(guardLabel0);
        etriple.add(new Conditional(
          new guard_0_0(new primeVariable[] {local0, local1}), guardLabel0));
        if (!etriple.getLabel().isEmpty())
          derivation.add(
            new RuleState_0(-1, new primeVariable[] {local0, local1},
              new Constraint[] {pc0, pc1}, (SolutionTriple) etriple.clone()));
        primeVarTab.removeConstraint(pc0);
        applied = true;
        break search;
      } while (false);
      etriple.backToPoint(point1);
    } // end of iteration
  } while (false);
  pc0.unlock();
  return applied;
}

Fig. 3. Code generated for prime(J) in prime(I) \ prime(J) <=> J mod I == 0 | true.


Furthermore, for adaptations after constraint deletions, all constraints are justi- 
fied by a set of integers. These justifications are used in the generated methods 
to perform truth maintenance. The generated methods additionally 

— unite all justifications of all constraints that are necessary for successful head 
matching 

— unite all justifications of all constraints that are necessary for guard entail- 
ment 

— justify the executed body constraints with the union of the justifications for 
head matching and guard entailment 

— store justifications and partners of the applied rules in rule state objects 

For adaptation after deletions, a rule state class is generated for each CHR. 
Every rule state class contains a method that retries a previously applied rule if 
its present justification is no longer valid. If there is no alternative justification, 
the previous rule application is undone: removed head-matching constraints are 
re-inserted in the constraint store or re-executed and the consequences of the 
executed body constraints are erased. 
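
The bookkeeping behind these justifications can be illustrated with a deliberately reduced Java sketch. It only shows how a derived constraint inherits the union of the justifications of its premises and how deleting one justification erases everything that depends on it. The retrying of rules with alternative justifications and the re-insertion of removed head constraints, as described above, are not modelled. The class JustifiedStoreSketch and its methods are hypothetical, and constraints are represented as strings.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical store in which every constraint carries a set of integer justifications. */
final class JustifiedStoreSketch {
    private final Map<String, Set<Integer>> store = new LinkedHashMap<>();

    /** Adds a constraint with an explicitly given justification. */
    void add(String constraint, Set<Integer> justification) {
        store.put(constraint, new TreeSet<>(justification));
    }

    /** Adds a derived constraint justified by the union of its premises' justifications. */
    void derive(String conclusion, String... premises) {
        Set<Integer> union = new TreeSet<>();
        for (String p : premises) {
            Set<Integer> j = store.get(p);
            if (j == null) throw new IllegalArgumentException("unknown premise: " + p);
            union.addAll(j);
        }
        store.put(conclusion, union);
    }

    /** Erases every constraint whose justification mentions the deleted integer. */
    void delete(int justification) {
        store.values().removeIf(j -> j.contains(justification));
    }

    Set<String> constraints() { return new LinkedHashSet<>(store.keySet()); }

    Set<Integer> justificationOf(String constraint) { return store.get(constraint); }
}
```

For instance, if leq(a,b) is justified by {1} and leq(b,c) by {2}, the derived constraint leq(a,c) is justified by {1, 2}, and deleting justification 2 removes both leq(b,c) and leq(a,c).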



Finding Partner Constraints. Like we believe the real challenge in im- 
plementations of multi-headed CHRs is efficient computation of joins for partner 
constraints. A naive solution is to compute the cross-product of all potential partner
constraints. However, if there are shared variables in the head constraints,
only a subset of the cross-product has to be considered. If we consider, for instance,
the transitivity rule leq(X,Y), leq(Y,Z) ==> leq(X,Z), which has to be tried
against all active constraints leq(u,v), only those leq constraints have to be considered
as potential partners that have either v in their first argument position or u
in their second. In order to (partially) apply this knowledge, the idea of variable
indexing (cf. m) is also implemented in our compiler. Thus, the partner search
is better focused if the arguments of the active constraints are variables, e.g.
if u and v are variables. The constraints in the store are therefore distributed
over all variables that occur in these constraints. The constraints are attached to
their variables as attribute values (cf. m). The attributes are named after the
constraints. For efficient O(1) access to these constraints, the compiler generates
for every CHR handler a subclass of variables to which the necessary attributes
are added. All constraints defined in the handler must therefore be known by
the compiler. This information is passed on when a compiler object is created
(e.g. in line 13 in Figure 2). The name of the variable subclass accommodates
this, receiving the handler’s name as a prefix (e.g. primeVariable for the prime
handler in Figure 2).
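
The gain from indexing can be sketched for the transitivity rule. In the following hypothetical Java fragment, hash maps keyed by variable name stand in for the attributes attached to variables; the class LeqIndexSketch and its members are illustrative assumptions and do not reproduce the generated variable subclass.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical index of leq constraints by the variable in each argument position. */
final class LeqIndexSketch {
    static final class Leq {
        final String x, y;
        Leq(String x, String y) { this.x = x; this.y = y; }
        @Override public String toString() { return "leq(" + x + "," + y + ")"; }
    }

    private final Map<String, List<Leq>> byFirst = new HashMap<>();
    private final Map<String, List<Leq>> bySecond = new HashMap<>();

    /** Attaches the constraint to both of its variables. */
    void insert(Leq c) {
        byFirst.computeIfAbsent(c.x, k -> new ArrayList<>()).add(c);
        bySecond.computeIfAbsent(c.y, k -> new ArrayList<>()).add(c);
    }

    /** Partner candidates for leq(X,Y), leq(Y,Z) ==> leq(X,Z) and an active leq(u,v). */
    List<Leq> partnersFor(Leq active) {
        List<Leq> result = new ArrayList<>();
        // active matches leq(X,Y): partners must have v in their first position
        result.addAll(byFirst.getOrDefault(active.y, Collections.<Leq>emptyList()));
        // active matches leq(Y,Z): partners must have u in their second position
        result.addAll(bySecond.getOrDefault(active.x, Collections.<Leq>emptyList()));
        return result;
    }
}
```

With leq(b,c), leq(z,a) and leq(p,q) in the store, the active constraint leq(a,b) only meets leq(b,c) and leq(z,a); leq(p,q) is never inspected.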

Unlike the SICStus Prolog implementation, the attribute values are not
merged when a variable binding occurs. If there is a variable binding
X = f(...Y...) or X = Y in SICStus Prolog, the attribute values stored under X
are added to the attribute values in Y because all variable occurrences of X in
constraints are “substituted” by f(...Y...) or Y, respectively. In our
implementation, however, only a “back pointer” (X ← Y) from Y to X is established. The
variables, together with these “back pointers”, define graph structures (more
precisely, rational trees²) that are traversed to access all the attribute values,
i.e. the constraints stored under an unbound variable. This design decision was
made because variable bindings caused by built-in constraints might be
arbitrarily deleted. In the case of a deletion of X = f(...Y...) or X = Y, only
the binding itself and the “back pointer” from Y to X have to be deleted. The
connected attribute values of X and Y are automatically separated because the
attribute values of X are no longer accessible from Y, the connecting link being
removed. This approach is much simpler and more efficient than restoring the
attribute values.
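
The following Java sketch illustrates the back-pointer idea for the simplest case, a binding X = Y between two variables. It is a hypothetical simplification: bindings to compound terms f(...Y...) are not modelled, attribute values are plain strings, and the names BackPointerSketch, Var, bind, unbind and constraintsUnder are assumptions made for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical variables whose attribute values stay in place when bindings occur. */
final class BackPointerSketch {
    static final class Var {
        final String name;
        final List<String> attached = new ArrayList<>();      // constraints stored under this variable
        final Set<Var> backPointers = new LinkedHashSet<>();  // variables that are bound to this one
        Var boundTo;                                           // null while unbound
        Var(String name) { this.name = name; }
    }

    /** X = Y: only the binding and a back pointer from Y to X are recorded. */
    static void bind(Var x, Var y) {
        x.boundTo = y;
        y.backPointers.add(x);
    }

    /** Deleting X = Y: remove the binding and the back pointer, nothing else. */
    static void unbind(Var x) {
        if (x.boundTo != null) {
            x.boundTo.backPointers.remove(x);
            x.boundTo = null;
        }
    }

    /** Collects all constraints reachable from an unbound variable via back pointers. */
    static List<String> constraintsUnder(Var unbound) {
        List<String> result = new ArrayList<>();
        Deque<Var> todo = new ArrayDeque<>();
        Set<Var> seen = new HashSet<>();   // guards against cyclic structures
        todo.push(unbound);
        while (!todo.isEmpty()) {
            Var v = todo.pop();
            if (!seen.add(v)) continue;
            result.addAll(v.attached);
            todo.addAll(v.backPointers);
        }
        return result;
    }
}
```

After bind(x, y) the constraints of both variables are reachable from y. After unbind(x) only those of y remain reachable, and nothing had to be copied back to x.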

Example 3 (Primes, continued). The compiled method for the head constraint
prime(J) in the CHR prime(I) \ prime(J) <=> J mod I == 0 | true is
presented in Figure 3. The formal parameter J (line 5) is matched to the actual
argument args[0] (lines 1 and 6) of the active constraint pc0. To find a partner
constraint matching prime(I), an iteration over all stored constraints is
activated until one is found that satisfies the guard condition (lines 9-38). Variable
indexing is impossible because there are no common formal head parameters
(the array of common primeVariable in line 10 is empty). The iteration
continues with the next candidate if the current candidate is already being used
(line 13). Otherwise, the formal parameter I is matched to the actual argument
of the candidate pc1.getArgs()[0] (lines 17 and 18). Then, the guard is tested
(lines 20-24) and the iteration continues with another candidate if the condition
J mod I == 0 is not satisfied (line 25). Otherwise, the rule is applicable and
the rule body would normally be executed. In this case, the body is empty (true) and
so only the united justifications (lines 3, 16, 26) for head matching and guard
entailment and the partners are stored for adaptation (lines 30-32) if necessary.
No adaptation is necessary if the union of all justifications is empty, i.e. always
true (cf. line 29). Last, the active constraint is deleted (line 33) and a Boolean
value is returned (line 41). It is true iff the rule was successfully applied and the
active constraint was deactivated. This flag is used to prevent the method for
the other head constraint prime(I) from being activated on pc0.



3.3 The Application Interface 

During the translation phase for each head constraint of a CHR, a Java method 
is generated. Methods for constraints that have the same name and arity are 
subsumed under a method that is named after the constraints and their arities. 
Furthermore, for each constraint name and arity, there is a method for reading 
the corresponding constraints out of the constraint store. These methods form 
the “generic” part of the application interface of the generated constraint solver. 

They are complemented by “non-generic” methods to add syntactical equations
to the constraint store, to delete all constraints with a specific justification, to
test the consistency of the stored built-in constraints and to get an explanation
(justification) for an inconsistency.

² Variable bindings like X = f(Y) and Y = g(X) are allowed, resulting in X ↔ Y.



Example 4. The application interface generated for the CHR handler in Figure 2
comprises the following methods

— public void prime_1(Logical[] args, SparseSet label)

— public ArrayList get_prime_1()

— public void equal(Logical lhs, Logical rhs, SparseSet lab)

— public void delete(SparseSet del)

— public boolean getStatus()

— public SparseSet getExplanation()

of the class prime, the class of constraint stores that are processed by the CHR
defined in the CHR handler. The variable subclass primeVariable of Variable
is generated, too.³

The use of the interface is shown in the following program: 

01 import common.*; // import the runtime system
02 import prime; // import the generated prime handler
03 public class primeTest {
04   public static void main( String[] args ) {
05     int n = Integer.parseInt( args[0] );
06     prime cs = new prime();
07     for (int i=2; i <= n; i++)
08       cs.prime_1(new Logical[] {new ZZ(i)}, new SparseSet(i));
09     cs.delete(new SparseSet(2));
10     cs.prime_1(new Logical[] {new ZZ(2)}, new SparseSet(2));
11   }
12 }

In lines 7-8, the constraints prime(2), ..., prime(n) are executed, where n
(a positive integer) is read from the command line (see line 5). In line 9,
the constraint prime(2) (the only constraint justified by 2) is deleted and then
re-added in line 10.



3.4 Runtime Comparisons 

The sieve of Eratosthenes was used as a benchmark to compare the adaptive
Java version with the recent SICStus Prolog implementation of CHR. In
particular, the Java program presented in Example 4, which uses the compiled code of
the handler in Figure 2, was compared to the following SICStus Prolog program

primetest(N) :-
    switch(Start, Phase), generate(3,N), runtime(End),
    Time is End-Start, print(Time), nl,
    Phase=delete, % causes backtracking and the deletion of prime(2)
    runtime(StartReAdd), prime(2), runtime(EndReAdd),
    ReAddTime is EndReAdd-StartReAdd, print(ReAddTime), nl.
switch(Time, process) :- runtime(Time), prime(2).
switch(Time, delete) :- runtime(Time).
runtime(Time) :- statistics(runtime, [_,Time]).
generate(I,N) :- I > N, !.
generate(I,N) :- prime(I), J is I+1, generate(J,N).



This program uses the SICStus CHR handler

handler prime.
constraints prime/1.
prime(I) \ prime(J) <=> J mod I =:= 0 | true.

Runtime measurements were made on a Pentium III PC running SuSE
Linux 6.2. For problem sizes n = 1000, 2000, 4000, 8000 and 16000, the
constraints prime(2), ..., prime(n) were generated and processed. Then, the
constraint prime(2) and its consequences were deleted and the result was
adapted/re-calculated. For this purpose, in the Java implementation the
interface method delete was used, which is based on repair algorithms
presented in [18110]. This causes a re-insertion of all constraints on even
numbers prime(2k), $2 \le k \le n/2$, and a re-removal of all these constraints except
prime(4). In the SICStus Prolog implementation, however, chronological
backtracking to the top level and re-processing of the constraints prime(3), ...,
prime(n) was forced, i.e. the equation Phase=delete causes a failure that causes
backtracking to the second clause of switch. Then, after both kinds of
adaptation, the constraint prime(2) was re-inserted. In both cases, this causes a
removal of the previously re-inserted constraint prime(4).

³ See Section [T2] for the use of such a Variable subclass.

The runtimes for generation and processing show that the purely interpreted 
Java code is about 1.7 times slower than the consulted SICStus Prolog code and 
that the partially compiled Java code (Java version 1.3 in mixed mode) is about
2.9 times slower than the compiled SICStus Prolog code. 

The runtimes for the deletion of prime (2) show the advantage of adaptation 
over recalculation: the purely interpreted Java code is about 2.6 times faster 
than the consulted SICStus Prolog code, and the partially compiled Java code 
is about 1.5 times faster than the compiled SICStus Prolog code. 

The runtimes for re-addition of prime (2) show that the purely interpreted 
Java code is about 6 times slower than the consulted SICStus Prolog code, and 
that the partially compiled Java code is about 5.6 times slower than the compiled 
SICStus Prolog code. 

Overall, the sums of the runtimes for all these operations are surprisingly
comparable: Figure 4(a) shows that the performance of the interpreted/consulted
code is nearly identical and that the compiled SICStus Prolog code is on the
whole marginally faster than the Java code in mixed mode. However, a relative
comparison of the two chosen adaptation strategies (“repair” and backtracking)
with re-calculation from scratch is shown in Figure 4(b): in Java, the adaptation
is 3-5 times faster than re-calculation, and the advantage increases with problem
size. As expected, there is no performance improvement in the SICStus Prolog
implementation.

A comparison of our Java implementation of CHR with the one presented
in H was not considered further. For n = 1000, this implementation takes
about 1 minute for the generation and processing phase. We assume that the
interpretation of CHRs rather than their compilation is the reason for this
runtime.

Fig. 4. A benchmark comparison of the prime handler



4 Applications 

The possibility of arbitrary constraint deletions opens up new application areas 
for CHR in constraint programming. One broad area is local search based on 
simulated annealing; another is back-jumping and dynamic backtracking. One 
application shows how CHR are used in a simple simulated-annealing approach 
to solve the well-known n queens problem. Another application compares chrono- 
logical backtracking, back-jumping and dynamic backtracking in the solution of 
satisfiability problems. 

4.1 Simulated Annealing for the n Queens Problem 

The n queens problem is characterized as follows: place n queens on an n x n 
chessboard such that no queen is attacked by another. One simple solution of 
this problem is to place the n queens (one per row) randomly on the board until 
no queen is attacked. To detect an attack, the following CHR is sufhcientQ 

queendjJ), queen(K,L) ==> I < K, (J == L ; K-I == abs(L-J)) 

I conflictCl, 1.0), conflict(K, 1.0). 

The constraints conflict(i, 1.0) and conflict(k, 1.0) are derived whenever
the queens in row/column i/j and k/l are attacking each other: they are either
in the same column ($j = l$) or on the same diagonal ($|k - i| = |l - j|$). To detect
the queens that are “in conflict with” the maximum number of other queens,
the following CHR sums up these numbers:

conflict(I,R), conflict(I,S) <=> T is R+S | conflict(I,T).
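
The effect of the two rules can be illustrated outside CHR. The following Python sketch is not the authors' implementation; the function name `conflict_counts` and the representation of queens as (row, column) pairs are assumptions made for illustration. It derives one unit of conflict per queen for every attacking pair and sums the units per row, which is what the propagation rule and the summing rule achieve together.

```python
from collections import Counter
from itertools import combinations

def conflict_counts(queens):
    """queens: list of (row, column) pairs, one queen per row (assumed).
    Returns a Counter mapping each row to the number of queens attacking it."""
    counts = Counter()
    for (i, j), (k, l) in combinations(sorted(queens), 2):
        # Same guard as the propagation rule: I < K, same column or same diagonal.
        if i < k and (j == l or k - i == abs(l - j)):
            counts[i] += 1   # plays the role of conflict(I, 1.0)
            counts[k] += 1   # plays the role of conflict(K, 1.0)
    return counts

# Three queens that all attack each other: every row ends up with 2 conflicts.
print(dict(conflict_counts([(1, 1), (2, 2), (3, 1)])))
```

A queen with the largest count is then the candidate to be moved by the search.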

The search algorithm to solve the n queens problem is based on a simple
simulated-annealing approach. An initially given temperature $T_0$ is cooled down
to minimize the total number C of conflicts: $T_k = T_0 \times \rho^k$ ($0 < \rho < 1$). The
search stops if either a solution is found or the temperature is below a predefined
level ($T_k < T_{min}$). While there are conflicts, a queen that is in conflict with the
maximum number of other queens is chosen and placed at another randomly
selected position, i.e. the corresponding constraint queen(i, j) is deleted and a
new constraint queen is inserted. If, for the new number D of conflicts, it
either holds that $D < C$ or $e^{-\frac{D-C}{T_k}} > \delta$, where $0 < \delta < 1$ is a random number, the
search continues. Otherwise, the moved queen is placed in its original position.
Using this simple simulated-annealing approach, solutions for 10, 20, 30, ... , 100
queens problems were easily found (0.5 sec. for 10 and 30 sec. for 100 queens).

The semicolon represents the logical “or” (∨) in the guard of the CHR.
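
The search loop described above can be sketched in Python as follows. This is a minimal illustration, not the CHR-based implementation of the paper: the function names, the default parameter values (`t0`, `rho`, `t_min`) and the geometric cooling step are assumptions, and the occasional arbitrary choice of the moved queen is omitted.

```python
import math
import random

def total_conflicts(cols):
    """Number of attacking pairs; cols[i] is the column of the queen in row i."""
    n = len(cols)
    return sum(1 for i in range(n) for k in range(i + 1, n)
               if cols[i] == cols[k] or k - i == abs(cols[k] - cols[i]))

def row_conflicts(cols, i):
    """Number of queens attacking the queen in row i."""
    return sum(1 for k in range(len(cols)) if k != i and
               (cols[i] == cols[k] or abs(k - i) == abs(cols[k] - cols[i])))

def anneal(n, t0=10.0, rho=0.99, t_min=1e-3, seed=0):
    """Returns (columns, remaining conflicts); parameters are illustrative."""
    rng = random.Random(seed)
    cols = [rng.randrange(n) for _ in range(n)]
    c = total_conflicts(cols)
    t = t0
    while c > 0 and t >= t_min:
        # Pick a queen with the maximum number of conflicts and move it.
        i = max(range(n), key=lambda r: row_conflicts(cols, r))
        old = cols[i]
        cols[i] = rng.choice([j for j in range(n) if j != old])
        d = total_conflicts(cols)
        delta = rng.random()
        if d < c or math.exp(-(d - c) / t) > delta:
            c = d                 # accept the move
        else:
            cols[i] = old         # put the queen back in its original position
        t *= rho                  # assumed geometric cooling step
    return cols, c
```

Calling `anneal(8)` returns a placement together with its number of remaining conflicts; whether that number reaches zero depends on the seed and the assumed cooling parameters.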

The runtime performance of the implementation is rather poor; it is easily 
outperformed by other approaches. However, the aim of this example was to 
show that adaptive constraint handling with CHR can be used for rapid proto- 
typing of local-search algorithms. These prototypes can be used for education or 
to examine and improve the search algorithm, e.g. the number of search steps 
required to find a solution.



4.2 Different Search Strategies for SAT Problems 

The SICStus Prolog distribution comes with several CHR handlers and example
applications. One of these example applications is a SAT(isfiability) problem,
called the Deussen problem ulm027r1. It is the conjunctive normal form of a
propositional logic formula with 23 Boolean variables. The problem is to find a
0/1 assignment for all these variables such that the formula, a conjunction of
Boolean constraints, is satisfied.

To solve such SAT problems, we coded and compiled the necessary CHRs 
that are part of the Boolean CHR handler in the SICStus Prolog distribution. 
These rules are: 



or(0,X,Y) <=> Y=X.
or(X,0,Y) <=> Y=X.
or(X,Y,0) <=> X=0,Y=0.
or(1,X,Y) <=> Y=1.
or(X,1,Y) <=> Y=1.
or(X,X,Z) <=> X=Z.
neg(0,X) <=> X=1.
neg(X,0) <=> X=1.
neg(1,X) <=> X=0.
neg(X,1) <=> X=0.
neg(X,X) <=> fail.

or(X,Y,A) \ or(X,Y,B) <=> A=B.
or(X,Y,A) \ or(Y,X,B) <=> A=B.
neg(X,Y) \ neg(Y,Z) <=> X=Z.
neg(X,Y) \ neg(Z,Y) <=> X=Z.
neg(Y,X) \ neg(Y,Z) <=> X=Z.
neg(X,Y) \ or(X,Y,Z) <=> Z=1.
neg(Y,X) \ or(X,Y,Z) <=> Z=1.
neg(X,Z) , or(X,Y,Z) <=> X=0,Y=1,Z=1.
neg(Z,X) , or(X,Y,Z) <=> X=0,Y=1,Z=1.
neg(Y,Z) , or(X,Y,Z) <=> X=1,Y=0,Z=1.
neg(Z,Y) , or(X,Y,Z) <=> X=1,Y=0,Z=1.
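
Individual rules of this handler can be checked against the truth tables of disjunction and negation. The following Python sketch is only an illustration (the helper names `models`, `neg` and `or_` are assumptions, not part of the handler): it enumerates all 0/1 assignments that satisfy the head constraints of a rule and shows that exactly the bindings in the rule body remain.

```python
from itertools import product

def models(constraints, variables=("X", "Y", "Z")):
    """All 0/1 assignments (in variable order) satisfying every constraint."""
    result = []
    for values in product((0, 1), repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(check(env) for check in constraints):
            result.append(values)
    return result

def neg(a, b):
    """neg(A,B): B is the negation of A."""
    return lambda env: env[b] == 1 - env[a]

def or_(a, b, c):
    """or(A,B,C): C is the disjunction of A and B."""
    return lambda env: env[c] == (env[a] | env[b])

# Head of the rule  neg(X,Z), or(X,Y,Z) <=> X=0,Y=1,Z=1.
print(models([neg("X", "Z"), or_("X", "Y", "Z")]))   # [(0, 1, 1)]
# Head of the rule  neg(X,X) <=> fail.
print(models([neg("X", "X")]))                       # []
```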



We then implemented three different labelling algorithms to solve SAT problems.
A labelling algorithm is a (systematic) search algorithm that assigns a
possible value to an unassigned variable - the variable is labelled - until either
all variables are assigned and the conjunction of all constraints is satisfied or
some constraints are violated. If a violation occurs, a labelled variable that has
an alternative value is selected. The selected variable is re-assigned an alternative
value. If there is a violation but no labelled variable with an alternative value
left, then the constraints are inconsistent, i.e. there is no assignment satisfying
them.

From time to time the moved queen is arbitrarily chosen, avoiding starvation.
See http://www.sics.se/sicstus.html.

The implemented labelling algorithms are based on chronological backtrack- 
ing, back-jumping and dynamic backtracking. Search based on chronological 
backtracking and back-jumping assigns the variables systematically in a fixed 
order. In the case of a violation, the last labelled variable is re-assigned if it 
has an alternative value, otherwise the assignments of some variables are
“forgotten” (deleted). If the search is based on back-jumping, the recent vari- 
able assignment that justifies the violation and all the following assignments are 
forgotten; in the chronological case, e.g. if the justification is missing, only the 
last variable assignment is forgotten. The search “backtracks” or “back-jumps” 
until the violation is solved or there is no labelled variable left to backtrack or 
to jump to. In the latter case, the problem is unsolvable. During search with 
dynamic backtracking, neither the assignment nor the backtracking is in fixed 
order. If there is a violation and there is no alternative value for the last assigned 
variable, only the recent variable assignment is deleted, which justifies this “dead 
end” of the search process - all other assignments are untouched. A detailed, 
more formal description of all these algorithms is given in [^. 
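
For orientation, the simplest of the three strategies, labelling with chronological backtracking, can be sketched in a few lines of Python. This is a generic illustration and not the DJCHR or SICStus code: clauses are assumed to be lists of non-zero integers in the usual DIMACS style (a negative integer denotes a negated variable), and the step counter simply counts how often an assignment is forgotten.

```python
def violated(clauses, assignment):
    """True if some clause has all its literals assigned and false."""
    for clause in clauses:
        if all(abs(lit) in assignment and assignment[abs(lit)] != (lit > 0)
               for lit in clause):
            return True
    return False

def label_chronological(clauses, num_vars):
    """Label variables 1..num_vars in fixed order.
    Returns (model or None, number of backtracking steps)."""
    assignment = {}
    untried = {}              # variable -> values that have not been tried yet
    steps = 0
    var = 1
    while 1 <= var <= num_vars:
        if var not in untried:
            untried[var] = [False, True]
        if untried[var]:
            assignment[var] = untried[var].pop()
            if not violated(clauses, assignment):
                var += 1      # label the next variable
        else:
            # No alternative value left: forget this assignment and go back.
            del untried[var]
            assignment.pop(var, None)
            var -= 1
            steps += 1
    return (dict(assignment) if var > num_vars else None), steps
```

Back-jumping and dynamic backtracking differ in which assignments are forgotten at the `else` branch; they need the justification of the violation, which this sketch does not record.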

We implemented search procedures based on back-jumping (DJCHR BJ)
and dynamic backtracking (DJCHR DBT) for SAT problems using the compiled
Boolean CHR handler in Java 1.3. These implementations were compared
with the search procedure, based on chronological backtracking, that comes with
the Boolean CHR handler in the SICStus Prolog distribution (SICStus CBT).
These three search procedures were used to solve the Deussen problem and some
SAT problems that are available in the Satisfiability Library (SATLIB). Runtime
measurements were made on a Pentium III PC using SICStus Prolog with
consulted program code and Java 1.3 in mixed mode.

Table 1 shows the counted numbers of backtracking/back-jumping steps and
the required runtime in milliseconds, used to find the (first) solution or to detect
the unsatisfiability of the problem. These runtime experiments show that either
back-jumping or dynamic backtracking requires fewer backtracking/back-jumping
steps than chronological backtracking for the considered problems. Additionally,
the improved search yields better absolute runtime performance of the Java
implementations for nearly all the examined benchmarks.
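
The claim about the number of steps can be checked directly against the figures of Table 1. The short Python sketch below copies the step counts from the table (the names `TABLE_1_STEPS` and `step_reduction` are illustrative) and computes, for each benchmark, the ratio between the steps of chronological backtracking and the better of the two other strategies.

```python
# (benchmark, CBT steps, BJ steps, DBT steps), copied from Table 1
TABLE_1_STEPS = [
    ("Deussen ulm027r1", 52, 36, 72),
    ("The Pigeon Hole 6", 14556, 3646, 1452121),
    ("aim-50-2_0-yes1-1", 11110, 552, 5178),
    ("aim-50-2_0-yes1-2", 384, 154, 90),
    ("aim-50-2_0-yes1-3", 34088, 301, 978),
    ("aim-50-2_0-yes1-4", 302, 123, 167),
    ("aim-50-2_0-no-1", 906558, 44492, 17697),
    ("aim-50-2_0-no-2", 70266, 944, 25528),
    ("aim-50-2_0-no-3", 172150, 46526, 295792),
    ("aim-50-2_0-no-4", 53874, 198, 5689),
]

def step_reduction(rows):
    """Map each benchmark to CBT steps divided by the better of BJ and DBT."""
    return {name: cbt / min(bj, dbt) for name, cbt, bj, dbt in rows}

for name, ratio in step_reduction(TABLE_1_STEPS).items():
    print(f"{name:20s} {ratio:8.2f}")
```

Every ratio is greater than 1, i.e. for each listed problem at least one of back-jumping and dynamic backtracking needs fewer steps than chronological backtracking.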

This application impressively demonstrates the new possibilities offered by 
adaptive constraint handling with CHR: the existence of justifications for all 
derived constraints including false allows high-level implementations of sophis- 
ticated backtracking and search algorithms. 



The whole benchmark set is available online at www.satlib.org.

Table 1. Runtime comparison on SATLIB benchmark problems (except ulm027r1).

SATLIB               number of      SICStus CBT          DJCHR BJ            DJCHR DBT
benchmark problems   solutions    steps     msec.     steps    msec.      steps      msec.
Deussen ulm027r1        16           52       250        36      939         72       1524
The Pigeon Hole 6        0        14556     20950      3646    17837    1452121   19384972
aim-50-2_0-yes1-1        1        11110     61310       552     6062       5178      47850
aim-50-2_0-yes1-2        1          384      2000       154     2519         90       1805
aim-50-2_0-yes1-3        1        34088    168180       301     3340        978      15951
aim-50-2_0-yes1-4        1          302      2160       123     2540        167       3416
aim-50-2_0-no-1          0       906558   1706830     44492   429141      17697     184587
aim-50-2_0-no-2          0        70266    415340       944    13340      25528     418031
aim-50-2_0-no-3          0       172150    674910     46526   483830     295792    3817240
aim-50-2_0-no-4          0        53874    236130       198     4298       5689      85381


5 Conclusions and Future Work 

The adaptive CHR system outlined in this paper was implemented over a
six-month period. The implemented system is the first system to combine recent
developments in CHR implementation with dynamic constraint solving. More 
specifically, the number of constraints in CHR’s heads is no longer limited to 
two, and rational trees of attributed variables are used to implement efficient ac- 
cess to the constraint store, especially during the partner search. Furthermore, 
arbitrary constraint additions and deletions are fully supported: constraint pro- 
cessing is automatically adapted. This opens up new areas in constraint pro- 
gramming for CHR. Three of these are now implemented: simulated annealing 
and adaptive search with back-jumping or dynamic backtracking. For the future,
interactive diagrammatic reasoning with CHR is planned, as well as the application
of other “fancy backtracking” algorithms to harder SAT problems; e.g. all
the AIM instances (cf. [12]) will be examined and discussed.

Other future activities will concentrate on the compiler in order to produce 
highly optimized code. Besides general improvements like early guard evaluation 
and the avoidance of code generation and processing, there are improvements 
of the adaptation process. This will make it possible to avoid re-processing of 
constraints that are removed by rule applications and later re-activated by un- 
doing these applications during adaptation. In some cases, it is correct and more 
efficient to put them directly back in the constraint store rather than activate 
them. This holds for removed constraints that would not have been re-activated
by a later wake-up even if they had not been removed.
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Abstract. One of the most powerful techniques for solving centralized constraint 
satisfaction problems (CSPs) consists of maintaining local consistency during
backtrack search (e.g. II il l. Yet, no work has been reported on such a combination
in an asynchronous setting. The difficulty in this case is that, in the usual
algorithms, the instantiation and consistency enforcement steps must alternate
sequentially. When brought to a distributed setting, a similar approach forces the
search algorithm to be synchronous in order to benefit from consistency
maintenance. Asynchronism [24,14] is highly desirable since it increases flexibility
and parallelism, and makes the solving process robust against timing variations.
One of the most well-known asynchronous search algorithms is Asynchronous
Backtracking (ABT). This paper shows how an algorithm for maintaining
consistency during distributed asynchronous search can be designed upon ABT. The
proposed algorithm is complete and has polynomial-space complexity. Since the
consistency propagation is optional, this algorithm generalizes forward checking
as well as chronological backtracking. An additional advance over existing
centralized algorithms is that it can exploit available backtracking-nogoods for
increasing the strength of the maintained consistency. The experimental evaluation
shows that it can bring substantial gains in computational power compared with
existing asynchronous algorithms.



1 Introduction 

Distributed constraint satisfaction problems (DisCSPs) arise when constraints and/or 
variables come from a set of independent but communicating agents. Successful cen- 
tralized algorithms for solving CSPs combine search with local consistency. Most local 
consistency algorithms prune from the domains of variables the values that are locally 
inconsistent with the constraints, hence reducing the search space. When a DisCSP is 
solved by distributed search, it is desirable that this search exploits asynchronism as 
much as possible. Asynchronism gives the agents more freedom in the way they can 
contribute to search, allowing them to enforce individual policies (on privacy, computa- 
tion, etc.). It also increases both parallelism and robustness. In particular, robustness is 
improved by the fact that the search can still detect unsatisfiability even in the presence 
of crashed agents. Existing work on asynchronous algorithms for distributed CSPs has 
focused on one of the following types of asynchronism:

a) deciding instantiations of variables by distinct agents. The agents can propose
different instantiations asynchronously (e.g. Asynchronous Backtracking (ABT) [24]).

b) enforcing consistency. The distributed process of achieving “local” consistency on
the global problem is asynchronous (e.g. Distributed Arc Consistency [25]).

A preliminary version of this paper has been presented at the CP2000 Workshop on Distributed
CSPs ITfil

T. Walsh (Ed.): CP 2001, LNCS 2239, pp. 271-285, 2001.
© Springer-Verlag Berlin Heidelberg 2001

Combining these two techniques is however not as easy as in the synchronous setting. A
straightforward mapping of the existing combination scheme cannot preserve asynchronism
of type a) Em. The contribution of this work is to consider consistency maintenance
as a hierarchical nogood-based inference. This makes it possible to concurrently i)
perform asynchronous search and ii) enforce the hierarchies of consistency, resulting in
an asynchronous consistency maintenance algorithm. Since the consistency propagation
is optional, this algorithm generalizes forward checking as well as chronological
backtracking. More general than existing centralized algorithms, our approach can use any
available backtracking nogoods to increase the strength of the maintained consistency.
As expected from the sequential case, the experiments show that substantial gains in
computational power can result from combining distributed search and distributed local
consistency.

2 Related Work 

The first complete asynchronous search algorithm for DisCSPs is Asynchronous
Backtracking (ABT) [24]. The approach in [24] considers that agents maintain distinct
variables. Nogood removal was discussed in Em. Other definitions of DisCSPs have
considered the case where the interest on constraints is distributed among agents
[25,20,14,7,5]. Ei proposes algorithms that fit the structure of a real problem (the nurse
transportation problem). The Asynchronous Aggregation Search (AAS) [14] family of
protocols actually extends ABT to the case where the same variable can be instantiated
by several agents (e.g. at different levels of abstraction [12,16]). An agent may also
not know all constraint predicates relevant to its variables. AAS offers the possibility to
aggregate several branches of the search. An aggregation technique for DisCSPs was then
presented in [10] and allows for simple understanding of privacy/efficiency mechanisms,
also discussed in III-. The use of abstractions, EH), not only improves on efficiency
but especially on privacy since the agents need to reveal fewer of their details. A general
polynomial space reordering protocol is described in [11] and several heuristics (e.g.
weak commitment-like) are discussed in [18]. Q explains how add-link messages can
be avoided. A technique enabling parallelization and parallel proposals in asynchronous
search is described in [19]. Several algorithms for achieving distributed arc consistency
are presented in S\25tl\ .

3 Preliminaries 

In this paper we target problems with finite domains (we target problems with numeric
domains in [12,16]). For simplicity, but here without loss of generality, we consider that
each agent Ai can propose instantiations to exactly one distinct variable, xi, and knows all
the constraints that involve xi. Therefore each agent, Ai, knows a local CSP, CSP(Ai),
with variables vars(Ai). We present the way in which our technique can be built on ABT,
a simple instance of AAS for certain timings and agent strategies, but it can be easily
adapted to more complex frameworks and extensions of AAS. ABT allows agents to
asynchronously propose instantiations of variables. In order to guarantee completeness
and termination, ABT uses a static order ≺ on agents. In the sequel of the paper, we
assume that the agent Ai has position i, i ≥ 1, when the agents are ordered according
to ≺. If i > j then Ai has a lower priority than Aj and Aj has a higher priority than Ai.
Ai is then a successor of Aj, and Aj a predecessor of Ai.

Fig. 1. Distributed search trees in ABT: simultaneous views of distributed search seen by A2, A3,
and A4, respectively. Each arc corresponds to a proposal from Ai to Aj. Circles show the believed
state of an agent. Dashed circle and line show known state that may have been changed.

Asynchronous distributed consistency: Most centralized local-consistency algorithms
prune from the domain of variables the values that are locally inconsistent with the
constraints. Their distributed counterparts (e.g. [25]) work by exchanging messages on value
elimination. The restricted domains resulting from such a pruning are called labels. In
this paper we will only consider the local consistency algorithms which work on labels
for individual variables (e.g. arc-, bound-consistency). Let P be a Distributed CSP with
the agents Ai, $i \in \{1..n\}$. We denote by C(P) the CSP defined by $\cup_{i \in \{1..n\}} CSP(A_i)$.

Let A be a centralized local consistency algorithm as just mentioned. We denote by
DC(A) a distributed consistency algorithm that computes, by exchanging value
eliminations, the same labels for P as A for C(P). When DC(A) is run on P, we say that P
becomes DC(A) consistent. Generic instances of DC(A) are denoted by DC. Typically
with DC im, the maximum number of generated messages is $a^2vd$ and the maximum
number of sequential messages is $vd$ (v: number of variables, d: domain size, a: number
of agents).

4 Asynchronous Consistency Maintenance 

In the sequential/synchronous setting, the view of the search tree expanded by a consis- 
tency maintenance algorithm is unique. Each node at depth k, corresponds to assigning 
to the variable Xk a value Vi from its label. Initially the label of each variable is set to 
its full domain. After each assignment Xk=Vi, a local consistency algorithm is launched 
which computes for the future variables the labels resulting from this assignment. 
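
As an illustration of the centralized case, the following Python sketch computes the labels that result from one assignment by enforcing arc consistency with a standard AC-3 style loop. It is a generic sketch, not the algorithm evaluated in this paper; the function names and the representation of binary constraints as a dictionary from directed arcs (x, y) to predicates are assumptions.

```python
from collections import deque

def revise(labels, constraints, x, y):
    """Remove values of x that have no support in y; return True if x shrank."""
    allowed = constraints[(x, y)]
    kept = {a for a in labels[x] if any(allowed(a, b) for b in labels[y])}
    changed = kept != labels[x]
    labels[x] = kept
    return changed

def maintain_arc_consistency(domains, constraints, var, value):
    """Labels after assigning var=value, or None on a domain wipe-out."""
    labels = {v: set(d) for v, d in domains.items()}
    labels[var] = {value}
    queue = deque(constraints)            # all directed arcs (x, y)
    while queue:
        x, y = queue.popleft()
        if revise(labels, constraints, x, y):
            if not labels[x]:
                return None
            # Arcs pointing at x must be re-examined.
            queue.extend((z, w) for (z, w) in constraints if w == x and z != y)
    return labels

lt = lambda a, b: a < b
gt = lambda a, b: a > b
doms = {"x1": {1, 2, 3}, "x2": {1, 2, 3}, "x3": {1, 2, 3}}
cons = {("x1", "x2"): lt, ("x2", "x1"): gt, ("x2", "x3"): lt, ("x3", "x2"): gt}
print(maintain_arc_consistency(doms, cons, "x1", 1))   # x2 and x3 are forced
print(maintain_arc_consistency(doms, cons, "x1", 2))   # None: wipe-out
```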

They can impose first eventual preferences they have on their values.

The union of two CSPs, P1 and P2, is a CSP containing all the constraints and variables of P1
and P2.

In distributed search (e.g. ABT), each agent has its own perception of the distributed
search tree. Its perception on this tree is determined by the proposals received from its
predecessors. Figure 1 shows a simultaneous view of three agents. Only A2 knows
the fourth proposal of A1. A3 has not yet received the third proposal of A2 consistent
with the third proposal of A1. However, A4 knows that proposal of A2. In Figure 1
we suppose that A4 has not received anything valid from A3 (e.g. after sending some
nogood to A3 which was not yet received). The term level in Figure 1 refers to the depth
in the (distributed) search tree viewed by an agent.

Let P be a Distributed CSP with the agents Ai, $i \in \{1..n\}$, A be a centralized local
consistency algorithm and DC(A) one of its distributed counterparts. Suppose that the
instantiation order of the variables in C(P) is determined by the order of the agents in
P. In order to guarantee that with DC(A) one maintains for the variables of agents Ai
of P the same labels, L, as with A in C(P), one can simply impose that:

1. Ai must have received the proposals of all its predecessors before launching DC(A),

2. Ai cannot make any proposal with values outside L, computed by DC(A).

This approach EM is synchronous. Alternatively, we propose to handle consistency
maintenance as a hierarchical task. We show that Ai can then benefit from the value
eliminations resulting from the proposals of subsets of its predecessors, as soon as
available. More precisely, if Ai has received proposals from some of its k first predecessors,
we say that it can benefit from value elimination (nogoods) of level k. Such nogoods
are determined by instantiations of xt, t<k (known proposals), DC process at level k or
inherited from DCs at previous levels along the same branch. A DC process of level k is
a process which only takes into account the known proposals of the k first agents. The
resulting labels are said to be of level k. When the nogoods defining labels are classified
according to their corresponding levels, and when they are coherently managed by agents
as shown here, the instantiation decisions and DCs of levels k can then be performed
asynchronously for different k with polynomial space complexity and without losing
the inference power of DC(A). Moreover, backtrack-nogoods involving only proposals
from agents can be used by DC at level k. Since the use of most nogoods is
optional, many distinct algorithms result from the employment of different strategies by
agents.

5 The DMAC-ABT Protocol 

This section presents DMAC-ABT (Distributed Maintaining Asynchronously Consistency
for ABT), a complete protocol for maintaining consistency asynchronously. Since
it builds on ABT, we start by recalling the necessary background and definitions. 

5.1 ABT 

In asynchronous backtracking, the agents run concurrently and asynchronously. Each
agent instantiates its variable and communicates the variable value to the relevant agents.
As described for AAS [14], since we do not assume (generalized) FIFO channels, in the
polynomial-space requirements description given here a local counter, $C_{x_i}$, in each
agent Ai is incremented each time a new instantiation is chosen. The current value of
$C_{x_i}$ tags each assignment made by Ai for xi.

Definition 1 (Assignment). An assignment for a variable xi is a tuple ⟨xi, v, c⟩ where
v is a value from the domain of xi and c is the tag value (value of $C_{x_i}$).

Among two assignments for the same variable, the one with the higher tag (attached 
value of the counter) is the newest. 

Rule 1 (Constraint-Evaluating-Agent) Each constraint C is evaluated by the lowest 
priority agent whose variable is involved in C. This agent is denoted CEA( C ). 
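
Rule 1 can be illustrated with a small Python sketch. The names `cea` and `enforced_constraints` and the dictionary-based representation are assumptions made for illustration only; since a larger position means a lower priority, the constraint-evaluating agent is the one with the largest position among the agents whose variables occur in the constraint.

```python
def cea(constraint_vars, position):
    """Position i of the constraint-evaluating agent A_i for one constraint.
    position maps a variable to the position of the agent owning it."""
    return max(position[x] for x in constraint_vars)

def enforced_constraints(constraints, position):
    """Group constraints by evaluating agent: {i: [constraint names]}."""
    groups = {}
    for name, variables in constraints.items():
        groups.setdefault(cea(variables, position), []).append(name)
    return groups

position = {"x1": 1, "x2": 2, "x3": 3}
constraints = {"c12": ("x1", "x2"), "c13": ("x1", "x3"), "c23": ("x2", "x3")}
print(enforced_constraints(constraints, position))
```

In this toy instance A2 evaluates c12 and A3 evaluates c13 and c23, so A1 enforces no constraint and only sends its value along its outgoing links.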

The set of constraints enforced by Ai is denoted ECSP(Ai) and the set of variables
that are involved in ECSP(Ai) is denoted evars(Ai), where xi ∈ evars(Ai). Each agent
holds a list of outgoing links represented by a set of agents. Links are associated with 
constraints. ABT assumes that every link is directed from the value sending agent to the 
constraint-evaluating-agent. 

Definition 2 (Agent_View). The agent_view of an agent, Ai, is a set, view(Ai), contain- 
ing the newest assignments received by Ai for distinct variables. 

Based on their constraints, agents perform inferences concerning the assignments in
their agent_view. By inference the agents generate new constraints called nogoods.

Definition 3 (Explicit Nogood). An explicit nogood has the form ¬N where N is a set
of assignments for distinct variables.

The following types of messages are exchanged in ABT: 

- ok? message transporting an assignment is sent to a constraint-evaluating-agent to 
ask whether a chosen value is acceptable. 

- nogood message transporting an explicit nogood. It is sent from the agent that infers
an explicit nogood ¬N, to the constraint-evaluating-agent for ¬N.

- add-link message announcing Ai that the sender Aj owns constraints involving Xi. 
Ai inserts Aj in its outgoing links and answers with an ok?. 

The agents start by instantiating their variables concurrently and send ok? messages
to announce their assignment to all agents with lower priority in their outgoing links.
The agents answer to received messages according to Algorithm 1 (given in [13]).

Definition 4 (Valid assignment). An assignment ⟨x, v1, c1⟩ known by an agent Ai is
valid for Ai as long as no assignment ⟨x, v2, c2⟩, c2 > c1, is received.

A nogood is valid if it contains only valid assignments. The next property is a 
consequence of the fact that ABT is an instance of AAS. 

Property 1 If only one valid nogood is stored for a value then ABT has polynomial space
complexity in each agent, O(dv), while maintaining its completeness and termination
properties. d is the domain size and v is the number of variables.

when received (ok?, ⟨x_j, d_j, C_xj⟩) do
    if (old C_xj) return;
    add ⟨x_j, d_j, C_xj⟩ to agent_view;
    eliminate invalidated nogoods;
    check_agent_view;
when received (nogood, A_j, ¬N) do
    when any ⟨x, d, c⟩ in N is invalid (old c) then
        send (ok?, ⟨x_i, current_value, C_xi⟩) to A_j;
        return;
    when ⟨x_k, d_k, c_k⟩, where x_k is not connected, is contained in ¬N
        send add-link to A_k;
        add ⟨x_k, d_k, c_k⟩ to agent_view;
    when ⟨x_i, d, c⟩ ∈ N, then put ¬N in nogood-list for x_i = d;
    add other new assignments to agent_view;
1.1 eliminate invalidated nogoods;
    old_value ← current_value;
    check_agent_view;
    when old_value = current_value
1.2     send (ok?, ⟨x_i, current_value, C_xi⟩) to A_j;

procedure check_agent_view do
    when agent_view and current_value are not consistent
        if no value in D_i is consistent with agent_view then
            backtrack;
        else
            select d ∈ D_i where agent_view and d are consistent;
            current_value ← d; C_xi++;
            send (ok?, ⟨x_i, d, C_xi⟩) to lower priority agents in outgoing links;

procedure backtrack do
    nogoods ← {V | V = inconsistent subset of agent_view};
    when an empty set is an element of nogoods
        broadcast to other agents that there is no solution, terminate this algorithm;
    for every V ∈ nogoods;
        select ⟨x_j, d_j, c⟩ where x_j has the lowest priority in V;
        send (nogood, A_i, V) to A_j;
        eliminate invalidated explicit nogoods;
        remove ⟨x_j, d_j, c⟩ from agent_view;
    check_agent_view;

Algorithm 1: Procedures of Ai for receiving messages in ABT with nogood removal.
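
The role of the tags in Definitions 1, 2 and 4 can be sketched in Python as follows. The class `AgentView` and its method names are hypothetical, and treating a message with an equal tag as old is an assumption of this sketch; the class only illustrates how an agent keeps the newest assignment per variable and decides whether a nogood is still valid.

```python
class AgentView:
    """Newest known assignment per variable, identified by its tag."""

    def __init__(self):
        self.newest = {}           # variable -> (value, tag)

    def receive(self, var, value, tag):
        """Store the assignment unless an equal or newer tag is known.
        Returns True if the assignment was stored."""
        known = self.newest.get(var)
        if known is not None and known[1] >= tag:
            return False           # old tag: the message is discarded
        self.newest[var] = (value, tag)
        return True

    def is_valid(self, var, value, tag):
        """Valid as long as no assignment with a higher tag is known."""
        known = self.newest.get(var)
        return known is None or known[1] <= tag

    def nogood_is_valid(self, nogood):
        """A nogood is valid if it contains only valid assignments."""
        return all(self.is_valid(*a) for a in nogood)

view = AgentView()
view.receive("x1", "a", 1)
view.receive("x1", "c", 2)
print(view.nogood_is_valid([("x1", "a", 1)]))   # False: x1 was re-assigned
```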



5.2 DMAC-ABT 



Parts of the content of a message may become invalid due to newer available information.
We require that messages arrive at destination in finite time after they are sent. The
receiver can discard the invalid incoming information, or can reuse invalid nogoods with
alternative semantics (e.g. as redundant constraints).

when received (propagate, A_j, k, c^k_xv(j), V → (x_v ∈ l)) do
2.1 when have higher tag c^k_xv(j, i) than c^k_xv(j) then return;
    c^k_xv(j, i) ← c^k_xv(j); when any ⟨x, d, c⟩ in V is invalid (old c) then return;
    when ⟨x_u, d_u, c_u⟩, where x_u is not connected, is contained in V
        send add-link to A_u; add ⟨x_u, d_u, c_u⟩ to agent_view;
2.2 add other new assignments in V to agent_view; eliminate invalidated nogoods;
    cn^k_xv(i, j) ← {V → (x_v ∈ l)}; maintain_consistency(minimal level that is modified);
    check_agent_view; // only satisfies consistency nogoods of levels t, t < cL_i;

procedure maintain_consistency(minT) do
    if (minT > cL_i) then return; // cL_i is the current inconsistent level (initially i+1);
2.3 for (t ← minT; t < i; t++)
2.4     new-cn ← consistency nogood for x_i after local consistency on P_i(t);
        when (domain wipe out by computing the explicit nogoods 𝒩)
            for every V ∈ 𝒩;
                select ⟨x_j, d_j, C_xj⟩ where x_j has the lowest priority in V;
                send (nogood, A_i, V) to A_j; eliminate invalidated explicit nogoods;
                remove ⟨x_j, d_j, C_xj⟩ from agent_view;
2.5         cL_i ← t; break;
        when new-cn shrinks label of x_i (obtained from ∪_{k<t} cn^k_xi(i, i))
2.6         cn^t_xi(i, i) ← new-cn; C^t_xi++;
            send (propagate, A_i, t, C^t_xi, new-cn) to agents A_j, j > t, x_i ∈ evars(A_j);

Algorithm 2: Procedure of Ai for receiving propagate messages in DMAC-ABT.



In addition to the messages of ABT, the agents in DMAC-ABT may exchange
information about nogoods inferred by DCs. This is done using propagate messages as
shown in Algorithm 2. Before making their first proposal as in ABT, cooperating agents
can start with a call to maintain_consistency(0).

Definition 5 (Consistency nogood). A consistency nogood for a level k and a variable
x has the form $V \rightarrow (x \in l^k_x)$ or $V \rightarrow \neg(x \in s \setminus l^k_x)$. V is a set of assignments. Any assignment
in V must have been proposed by $A_k$ or its predecessors. $l^k_x$ is a label, $l^k_x \neq \emptyset$. s is the
initial domain of x.

The propagate messages for a level k are sent to all agents Ai, i>k, xi ∈ evars(Ai).
They take as parameters the reference k of a level and a consistency nogood. Each
consistency nogood for a variable xi and a level k is tagged with the value of a counter
$C^k_{x_i}$ maintained by the sender. The agents Ai use the most recent proposals of the agents
Aj, j<k when they compute DC consistent labels of level k. Ai may receive valid
consistency nogoods of level k with assignments for the set of variables V, V not in
evars(Ai). Ai must then send add-link messages to all agents Ak', k'<k not yet linked
to Ai and owning variables in V. In order to achieve consistencies asynchronously,
besides the structures of ABT, implementations can maintain at any agent Ai, for any
level k, k<i:

- The set, $V^k_i$, of the newest valid assignments proposed by agents Aj, j<k, for each
interesting variable.

- For each variable x, x ∈ vars(Ai), for each agent Aj, j>k, the last consistency nogood
(with highest tag) sent by Aj for level k, denoted $cn^k_x(i,j)$. $cn^k_x(i,j)$ is stored only
as long as it is valid. It has the form $V^k_{j,x} \rightarrow (x \in l^k_{j,x})$.

$NV_i(V^k_i)$ is the constraint of coherence of Ai with the view $V^k_i$. Let $cn^k_x(i,\cdot)$ be
$(\cup_{t \le k, j} V^t_{j,x}) \rightarrow (x \in \cap_{t \le k, j} l^t_{j,x})$ and

$P_i(k) := CSP(A_i) \cup (\cup_x cn^k_x(i,\cdot)) \cup NV_i(V^k_i) \cup CL^k_i$.

$C^k_{x_i}$ is incremented on each modification of $cn^k_{x_i}(i,i)$ (line 2.6).

when received (ok?, ⟨x_j, d_j, C_xj⟩) do
    if (old C_xj) return;
3.1 add ⟨x_j, d_j, C_xj⟩ to agent_view; eliminate invalidated nogoods;
    maintain_consistency(j);
    check_agent_view; // only satisfies consistency nogoods of levels t, t < cL_i;

procedure check_agent_view do
    when agent_view and current_value are not consistent // cf. nogoods of levels t, t < cL_i
        if no value in D_i is consistent with agent_view then
            backtrack;
        else
            select d ∈ D_i where agent_view and d are consistent;
            current_value ← d; C_xi++; maintain_consistency(i);
            send (ok?, ⟨x_i, d, C_xi⟩) to lower priority agents in outgoing links;

Algorithm 3: Procedures of Ai for receiving ok? messages in DMAC-ABT.

Or a previously known label of x (for AAS).

On each modification of $P_i(k)$, $cn^k_{x_i}(i,i)$ is recomputed by inference (e.g. using
local consistency techniques at line 2.4) for the problem $P_i(k)$. $cn^k_{x_i}(i,i)$ is initialized
as an empty constraint set. $CL^k_i$ is the set of all nogoods known by Ai and having the
form $V \rightarrow C$ where $V \subseteq V^k_i$ and C is a constraint over variables in vars(Ai). $cn^k_{x_i}(i,i)$ is
stored and sent to other agents by propagate messages iff its label shrinks and either
CSP(Ai) or $CL^k_i$ was used for its logical inference from $P_i(k)$. This is also the moment
when $C^k_{x_i}$ is incremented. The procedure for receiving propagate messages is given in
Algorithm 2.

We now prove the correctness, completeness and termination properties of DMAC- 
ABT. We only use DC techniques that terminate (e.g. EUzi). By quiescence of a group 
of agents we mean that none of them will receive or generate any valid nogoods, new 
valid assignments, propagate or add-link messages. 

Property 2 In finite time $t^i$ either a solution or failure is detected, or all the agents
Aj, 0<j<i reach quiescence in a state where they are not refused a proposal satisfying
ECSP(Aj) ∪ NV_j(view(Aj)).

Proposition 1. DMAC-ABT is correct, complete and terminates. 

The proof is given in Annexes. It remains to show the properties of the labels
computed by DMAC-ABT at each level of the distributed search tree. If the agents, using
DMAC-ABT, store all the valid consistency nogoods they receive, then DCs in DMAC-ABT
converge and compute a locally consistent global problem at each level (each pair
initial_constraint-variable_label is checked by some agent). If, on the contrary, the agents
do not store all the valid consistency nogoods they receive but discard some of them
after inferring the corresponding $cn^k_x(i,i)$, then some valid bounds or value eliminations
can be lost when a $cn^k_x(i,i)$ is invalidated. Different labels are then obtained in different
agents for the same variable. As a result of these differences, the DC at the given
level of DMAC-ABT can stop before the global problem is DC consistent at that level.

Among the consistency nogoods that an agent computes itself at level $k$ from its 
constraints, $cn_x^k(i,i)$, let it store only the last one for each variable and only as long as it 
is valid. Let $A_i$ also store only the last (with highest tag) consistency nogood, $cn_x^k(i,j)$, 
sent to it for each variable $x \in$ vars($A_i$) at each level $k$ from any agent $A_j$. $cn_x^k(i,j)$ is 
also stored only as long as it is valid. Each agent stores the highest tag $c_x^k(j)$ for each 
variable $x$, level $k$ and agent $A_j$ that sends labels for $x$. Then: 

Proposition 2. DC(A) labels computed at quiescence at any level using propagate 
messages are equivalent to A labels when computed in a centralized manner on a 
processor. This is true whenever all the agents reveal consistency nogoods for all minimal 
labels, $l_x^k$, which they can compute and when the $CL_i^k$ are not used. 

Proof. In each sent propagate message, the consistency nogood for each variable is 
the same as the one maintained by the sender. By checking $c_x^k(j)$ at line 2.1, the stored 
consistency nogoods are coherent and are invalidated only when newer assignments are 
received (an event that is coherent) at lines 1.1, 2.2, 3.1. Any assignment invalid in one agent 
will eventually become invalid for any agent. Therefore, any such nogood is discarded at 
any agent iff it is also discarded at its sender. The labels known at different agents, being 
computed from the same consistency nogoods, are therefore identical and the distributed 
consistency will not stop at any level before the global problem is locally consistent in 
each agent. □ 
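
The storage rule used in the proof (keep only the highest-tagged consistency nogood per variable, level and sender, and drop it once it is invalid) can be sketched as follows. This is a minimal illustration, not the authors' data structure: the class names, the test "premise is a subset of the agent view" for validity, and the intersection of labels over levels up to the current one are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsistencyNogood:
    premise: frozenset   # assignments (variable, value, counter) the label relies on
    variable: str
    label: frozenset     # values still allowed for `variable`


class NogoodStore:
    """Hypothetical per-agent store: one nogood per (variable, level, sender)."""

    def __init__(self):
        self.highest_tag = {}  # (variable, level, sender) -> highest tag seen
        self.nogoods = {}      # (variable, level, sender) -> ConsistencyNogood

    def receive(self, level, sender, tag, nogood):
        """Accept a propagate message only if its tag is newer than the stored one."""
        key = (nogood.variable, level, sender)
        if tag <= self.highest_tag.get(key, 0):
            return False
        self.highest_tag[key] = tag
        self.nogoods[key] = nogood
        return True

    def drop_invalid(self, agent_view):
        """Discard nogoods whose premise is no longer part of the agent view.

        Tags are kept, so an old message cannot be accepted again later.
        """
        for key, nogood in list(self.nogoods.items()):
            if not nogood.premise <= agent_view:
                del self.nogoods[key]

    def label(self, variable, level, domain):
        """Intersect the labels of all stored nogoods for `variable` up to `level`."""
        result = set(domain)
        for (var, lvl, _sender), nogood in self.nogoods.items():
            if var == variable and lvl <= level:
                result &= nogood.label
        return result
```

Because every agent applies the same acceptance test to the same messages, two agents that have received the same valid nogoods end up computing the same label, which is the point the proof relies on.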

Since consistency nogoods are not discarded when nogoods are sent to agents 
generating their assignments, asynchronism is ensured by temporarily disregarding those 
consistency nogoods. In Algorithm 2 we only satisfy consistency nogoods at levels lower 
than the current inconsistent level, $cL_i$ (see line 2.5 in Algorithm 2). Alternatively, such 
consistency nogoods could be discarded but then, to ensure coherence of labels, agents 
receiving any nogood should always broadcast assignments with new tags and many 
nogoods would be unnecessarily invalidated. 

ABT may deal with problems that require privacy of domains. For such problems, 
agents may refuse to reveal labels for some variables, especially since the initial labels 
at level 0 are given by the initial domains. The strength of the maintained consistency 
is then a function of how many such private domains are involved in the problem. The 
DisCSPs presenting only privacy on constraints, and the corresponding versions and 
extensions of ABT, suffer less from this problem. 

Proposition 3. The minimum space an agent needs with DMAC-ABT for ensuring 
maintenance of the highest degree of consistency achievable with DC is $O(v^2(v + d))$. With 
bound consistency, the required space is $O(v^3)$. 



The proof is given in Annexes. 
procedure maintain_consistency(minT) do 
    if (minT > $cL_i$) then return; 
4.1 for (t ← minT; t<i; t++) 
        new-cns ← consistency nogoods for all vars($A_i$) after local consistency on $P_i(t)$; 
        when (domain wipe out by computing explicit nogoods nogoods) 
            for every $V \in$ nogoods, 
                select ($x_j$, $d_j$, $c_{x_j}$) where $x_j$ has the lowest priority in $V$; 
4.2             send (nogood, $\neg V$) to $A_j$; eliminate invalidated explicit nogoods; 
                $cL_i$ ← t; remove ($x_j$, $d_j$, $c_{x_j}$) from agent_view; 
            break; 
        forall new-cn consistency nogood for any variable $x_u$ in new-cns 
            when new-cn shrinks label of $x_u$ (obtained from $cn_{x_u}^t(\cdot,\cdot)$) 
                $cn_{x_u}^t(i,i)$ ← new-cn; $c_{x_u}^t(i)$++; 
                send (propagate, $A_i$, t, $c_{x_u}^t(i)$, new-cn) to agents $A_j$, j>t, $x_u \in$ vars($A_j$); 
Algorithm 4: Procedure of $A_i$ for receiving propagate messages in DMAC-ABT1. 



5.3 Using Available Valid Nogoods in $P_i(k)$ for Maintaining Consistency 
(DMAC-ABT1) 

In Algorithm 2 an agent $A_j$ only sends consistency nogoods for the variable $x_j$. However, 
when the local consistency is computed for $P_j(k)$, new labels are also computed for other 
variables known by $A_j$. 

If in $P_i(k)$ we only use consistency nogoods and initial constraints, the final result 
of the consistency maintenance is coherent in the sense that at quiescence at any given 
level, each agent ends knowing the same label for each variable. Namely, the new label 
obtained by $A_j$ for some variable $x_u$ will be computed and sent by $A_u$ after receiving 
the other labels in consistency nogoods and instantiations that $A_j$ knows and are related 
to $x_u$. 

We propose that agents can use in their $P_i(k)$ valid explicit nogoods that they have 
received by nogood messages or old and invalidated consistency nogoods stored as 
redundant constraints. In this last case the labels obtained with Algorithm 2 are no 
longer minimal since an agent does not know all constraints that can be used by $A_j$ 
locally for computing its version of the label of $x_u$ at level $k$. 

In Algorithm 4 we present a version of DMAC-ABT that we call DMAC-ABT1. In 
DMAC-ABT1, $A_i$ can send consistency nogoods for all variables found in CSP($A_i$). The 
space complexity for storing the last tags for the consistency nogoods at all levels and 
coming from all other agents is now $O(v^3)$ and for DMAC-ABT1 the space complexity 
is $O(v^3(v + d))$. However, the power of DCs is increased since it can accommodate any 
available nogood. The number of sequential messages is also reduced since there is no 
need to wait for $A_u$ to receive the label of $x_i$ before reducing the label of $x_u$. Rather, $A_j$ 
itself propagates the label of $x_u$. 

Proposition 4. The minimum space an agent needs with DMAC-ABT1 for ensuring 
maintenance of the highest degree of consistency achievable with DC is $O(v^3(v + d))$. 
With bound consistency, the required space is $O(v^4)$. 
1: $A_1$ --ok?($x_1$, 1, 1)--> $A_3$ 
2: $A_2$ --propagate($A_2$, 0, 1, $x_3 \notin \{2\}$)--> $A_1$ 
3: $A_2$ --propagate($A_2$, 0, 1, $x_3 \notin \{2\}$)--> $A_3$ 
4: $A_2$ --ok?($x_2$, 2, 1)--> $A_3$ 
5: $A_1$ --propagate($A_1$, 0, 1, $x_1 \notin \{1\}$)--> $A_3$ 
6: $A_1$ --ok?($x_1$, 2, 2)--> $A_3$ 
7: $A_3$ --propagate($A_3$, 0, 1, $x_1 \notin \{1\}$)--> $A_1$ 
8: $A_3$ --nogood $\neg$($x_1$, 1, 1)--> $A_1$ 

Fig. 2. Simplified example for DMAC-ABT1. Depending on the exact timing of the network, some 
of these messages are no longer generated. Only 2 messages are sequential (half round-trips). ABT 
needs 4 sequential messages (half round-trips) for the same example (see [23]). 



The proof is given in Annexes. We denote by DMAC-ABT2 the version of DMAC- 
ABT where any agent Ai can compute, send and receive labels for variables constrained 
by their stored nogoods and redundant constraints but not found in vars(Ai). 

6 Example 

In Figure 2 we show a trace of DMAC-ABT1 for the example described in [22]. Before 
making its proposal, $A_2$ sends propagate messages to announce the consistency nogood 
$x_3 \notin \{2\}$ of level 0, tagged with $c_{x_3}^0(2) = 1$. These propagate messages are sent both 
to $A_1$ and $A_3$. $A_1$ sends an ok? message proposing a new instantiation. 

$A_3$ (and $A_1$ when the domain of $x_3$ is public) both compute the consistency nogood 
$x_1 \notin \{1\}$ at level 0. $A_3$ computes an explicit nogood from consistency at level 1 and 
sends it to $A_1$. This nogood is invalid since $A_1$ has already changed its instantiation 
(and a small modification of DMAC-ABT1, for simplicity not given here, can avoid 
sending it). Then solution and quiescence are reached. The longest sequence of messages 
valid at their receivers (length 2) consists of messages 2 and 6. The worst case timing (slow 
communication channel from $A_2$ to $A_1$ or privacy for the domain of $x_3$) gives the longest 
sequence 3, 7, 6 (5 would not be generated). The fact that ABT (as well as any synchronous 
algorithm) would require at least 4 sequential messages illustrates the parallelism offered 
by asynchronous consistency maintenance. 
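
The level-0 label reductions mentioned in this example can be reproduced on a small centralized instance. The instance below is an assumption, since the cited example is not restated in this paper: three variables with domains $x_1 \in \{1,2\}$, $x_2 \in \{2\}$, $x_3 \in \{1,2\}$ and the constraints $x_1 \ne x_3$, $x_2 \ne x_3$. It was chosen because it is one instance in which value 2 is removed from $x_3$ and value 1 from $x_1$, which matches a trace where $A_1$ first proposes 1 and later 2. The function name is hypothetical, and plain arc consistency stands in for whichever local consistency the agents use.

```python
from collections import deque


def ac3_not_equal(domains, neq_pairs):
    """Arc consistency for binary 'not equal' constraints (centralized sketch)."""
    doms = {var: set(vals) for var, vals in domains.items()}
    # Each constraint a != b yields the two directed arcs (a, b) and (b, a).
    arcs = [(a, b) for a, b in neq_pairs] + [(b, a) for a, b in neq_pairs]
    queue = deque(arcs)
    while queue:
        x, y = queue.popleft()
        # A value of x is unsupported if every remaining value of y equals it.
        unsupported = {a for a in doms[x] if all(a == b for b in doms[y])}
        if unsupported:
            doms[x] -= unsupported
            # Re-examine the other arcs that point at x.
            queue.extend((p, q) for p, q in arcs if q == x and p != y)
    return doms


labels = ac3_not_equal(
    {"x1": {1, 2}, "x2": {2}, "x3": {1, 2}},
    [("x1", "x3"), ("x2", "x3")],
)
print(labels)
```

In the distributed setting each removal would travel as a consistency nogood in a propagate message rather than being applied to a shared table.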

7 Experiments 

We have presented here DMAC-ABT1, an algorithm that maintains consistency 
in ABT. ABT was chosen since it is simpler to present and explain. Recently we have 
presented an extension of ABT that allows several agents to propose modifications to the 
same variable and allows agents to aggregate values in domains. That extension is called 
Asynchronous Aggregation Search (AAS) [14]. It is shown in [14] that the aggregations 
bring to ABT improvements of an order of magnitude for versions that maintain a 
polynomial number of nogoods. Here it is therefore appropriate to test the improvements 
that our technique for maintaining consistency brings to AAS. The version of 
DMAC-ABT1 for AAS is denoted DMAC. 
Fig. 3. Results averaged over 500 problems per point. 



We have run our tests on a local network of SUN stations where agents are placed 
on distinct computers. We use a technique that enables agents to process with higher 
priority propagate and ok? messages for lower levels. 

The DC used in our experimental evaluation maintains bound consistency. In each 
agent, computation at lower levels is given priority over computations at higher levels. 
We randomly generated problems with 15 variables of 8 values and a graph density of 20%. 
Their constraints were randomly distributed in 20 subproblems for 20 agents. Figure 3 
shows their behavior for variable tightness (percentage of feasible tuples in constraints), 
averaged over 500 problems per point. We tested two versions of DMAC, A1 and A2. A1 
asynchronously maintains bound consistency at all levels. A2 is a relaxation where agents 
only compute consistency at levels where they receive new labels or assignments, not 
after reduction inheritance between levels. A2 is obtained in Algorithm 4 by performing 
the cycle starting at line 4.1 only for $t = k$, where $k$ is the level of the incoming 
ok? or propagate message triggering it. In both cases, the performance of DMAC is 
significantly improved compared to that of AAS. Even for the easy points where AAS 
requires fewer than 2000 sequential messages, DMAC proved to be more than 10 times 
better on average. A2 was slightly better than A1 on average (except at tightness 15%). 
In these experiments we have stored only the minimal number of nogoods. The nogoods 
are the main gain of parallelism in asynchronous distributed search. Storing additional 
nogoods was shown for AAS to strongly improve performance of asynchronous search. 
As future research topic, we foresee the study of new nogood storing heuristics 118 1241 
1221 18161 . 

8 Conclusion 



Consistency maintenance is one of the most powerful techniques for solving central- 
ized CSPs. Bringing similar techniques to an asynchronous setting poses the problem 
of how search can be asynchronous when instantiation and consistency enforcement 
steps are combined. We present a solution to this problem. A distributed search protocol 
which allows for asynchronously maintaining distributed consistency with polynomial 
space complexity is proposed. DMAC-ABT builds on ABT, the basic asynchronous 
search technique. However, DMAC-ABT can be easily integrated into more complex 
versions of ABT (combining it with AAS and using abstractions lfl6l . one can use com- 
plex splitting strategies (TtII to deal efficiently with numeric DisCSPs ifTSll l. Another 
original feature of DMAC is its capability of using backtrack nogoods to increase the 
strength of the maintained consistency.^ The experiments show that the overall 
performance of asynchronous search with consistency maintenance is significantly improved 
compared to that of asynchronous search that does not maintain consistency. 



Annexes (Proof) 

Property 2. In finite time $t^i$ either a solution or failure is detected, or all the agents 
$A_j$, $0 \le j \le i$, reach quiescence in a state where they are not refused a proposal satisfying 
CSP($A_j$) $\cup$ $NV_j$(view($A_j$)). 

Proof. The proof is by induction on i. Let this be true for the agents Aj,j<i. Let r 
be the maximum time taken by a message. After f‘~^ + t, Ai no longer receives ok? 
messages. Ai receives the last valid ok? message at time + T. 3tl, -f T>tl 

such that after view(Ai) and all fc<z of any agent Au are no longer modified. 
The set of disabled tuples in CL^, k<i can contain only a bounded number of elements 
for each agent and they cannot be invalidated after f*. CL^, k<i cannot be invalidated 

after . Since DCs were assumed to terminate, they terminate after each modification 
of a CL^ . Since the number of such modifications that can generate a new consistency 
nogood after f® is bounded, after a finite time no consistency nogood is received any 
longer by Ai for levels k<i. 

Since the domains are finite, Ai can make only a finite number of different proposals 
satisfying view(Ai). Once any of them is sent, the total number of consistency nogoods 
that can be received before the proposal is modified is finite (this results by induction 
to levels k<i of the reasoning for k<i in the previous paragraph since after vt, Ai 
can receive only valid nogoods: valid explicit nogoods trigger the modification of the 
instantiation of Ai so that they can arrive only in finite time; if valid explicit nogoods 
are not received and no instantiation modification is done in finite time, no ok? is sent 
any longer by Ai, and the number of valid consistency nogoods at level i is limited as 
in the previous paragraph). 

Only one valid explicit nogood can be received for a proposal since the proposal is 
immediately changed on such an event. Invalid nogoods can be received only within vt 
time delay after a proposal is made. Therefore, there is a finite number of nogoods that 
can be received by Ai for any of its proposals made after fj, (and after f® ). 

1 . If one of the proposals is not refused by incoming nogoods, and since the number 
of received nogoods is finite, the induction step is correct. 

2. If all proposals that Ai can make after are refused or if it cannot find any 
proposal, Ai has to send according to rules Inherited from ABT a valid explicit nogood 

to somebody. is valid since all the assignments of Ak,k < i were received at 
Ai before f® . 

2.a) If $N$ is empty, failure is detected and the induction step is proved. 

2.b) Otherwise $\neg N$ is sent to a predecessor $A_j$, $j<i$. Since $\neg N$ is valid, the proposal 
of $A_j$ is refused, but due to the premise of the inference step, $A_j$ either 

2.b.i) finds an assignment and sends ok? messages or 



^ Since this paper was submitted, m presents an algorithm reusing some backtrack nogoods in 
MAC. That algorithm can be proven to behave as a centralized instance of DMAC. 
2.b.ii) announces failure by computing an empty nogood (induction proven). 

In case (i), since $\neg N$ was generated by $A_i$, $A_i$ is interested in all its variables, and 
it will be informed by $A_j$ of the modification by an ok? message. 

Case 2.b.i contradicts the assumption that the last ok? message was received by Ai at 
time tg and the induction step is therefore proved for all alternative cases. The property 
can be attributed to an empty set of agents and it is therefore proved by induction for all 
agents. □ 



Proposition 1. DMAC-ABT is correct, complete and terminates. 

Proof. Completeness: All the nogoods are generated by logical inference from existing 
constraints. Therefore, if a solution exists, no empty nogood can be generated. 

No infinite loop: The result follows from Property 2. 

Correctness: All valid proposals are sent to all interested agents and stored there. At 
quiescence all the agents know the valid interesting assignments of all predecessors. If 
quiescence is reached without detecting an empty nogood, then all the agents agree with 
their predecessors and their intersection is nonempty and correct. □ 



Proposition 3. The minimum space an agent needs with DMAC-ABT for ensuring 
maintenance of the highest degree of consistency achievable with DC is $O(v^2(v + d))$. With 
bound consistency, the required space is $O(v^3)$. 

Proof. $d$ is the maximal domain size and $v$ the number of variables. The space required for storing all 
valid assignments is $O(v)$ for values and $O(v)$ for the corresponding counters. The agents 
need to maintain at most $v$ levels, each of them dealing with at most $v$ variables, for 
each of them having at most 1 last consistency nogood. Each consistency nogood refers 
to at most $v$ assignments in its premise and stores at most $d$ values in its label. The stack of labels 
therefore requires $O(v^2(v + d))$. The space required by the algorithm for solving the 
local problem depends on the corresponding technique (e.g. chronological backtracking 
requires $O(v)$). The stored explicit nogoods require $O(dv)$ as mentioned in Property 1. 
DMAC-ABT also stores $O(v^2)$ tags for consistency nogoods. □ 



Proposition 4. The minimum space an agent needs with DMAC-ABT1 for ensuring 
maintenance of the highest degree of consistency achievable with DC is $O(v^3(v + d))$. 
With bound consistency, the required space is $O(v^4)$. 

Proof. The agents need to maintain at most $v$ levels, each of them dealing with at most 
$v$ variables, for each of them having at most $v$ last consistency nogoods. Each consistency 
nogood refers to at most $v$ assignments in its premise and stores at most $d$ values in its label. The 
stack of labels therefore requires $O(v^3(v + d))$. DMAC-ABT1 also stores $O(v^3)$ tags for 
consistency nogoods. The other structures are the same as for DMAC-ABT. □ 
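
The counting argument in the two proofs above can be checked numerically. The sketch below multiplies the factors named in the proofs (levels, variables per level, stored nogoods per variable, and the size of one nogood as premise assignments plus label values). The function names are hypothetical, constants are ignored, and modelling bound consistency as a label of two bounds is an assumption.

```python
def label_stack_cells(v, d, nogoods_per_variable):
    """Worst-case cell count for the stack of labels, following the proofs' factors."""
    levels = v                 # at most v levels
    variables = v              # at most v variables per level
    cells_per_nogood = v + d   # up to v premise assignments plus up to d label values
    return levels * variables * nogoods_per_variable * cells_per_nogood


def dmac_abt_cells(v, d):
    # DMAC-ABT keeps at most 1 last consistency nogood per variable and level.
    return label_stack_cells(v, d, 1)


def dmac_abt1_cells(v, d):
    # DMAC-ABT1 keeps at most v last consistency nogoods per variable and level.
    return label_stack_cells(v, d, v)


# Parameters of the experiments section: 15 variables, 8 values per domain.
print(dmac_abt_cells(15, 8))    # 15 * 15 * 1 * 23
print(dmac_abt1_cells(15, 8))   # 15 * 15 * 15 * 23
# Assumed model of bound consistency: a label is two bounds, so d is replaced by 2.
print(dmac_abt_cells(15, 2))
```

The two variants differ by exactly one factor of $v$, which is the cost of letting every agent send labels for every variable it knows.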
Abstract. We show that existing constraint manipulation technology 
incorporated in the paradigm of symbolic model checking with rich 
assertional languages can be successfully applied to the 
verification of client-server protocols with a finite but unbounded number 
of clients. Abstract interpretation is the mathematical bridge between 
protocol specifications and the constraint-based verification method on 
heterogeneous data used in the Action Language Verifier, a model checker 
for CTL [BYK01]. The method we propose is incomplete but fully 
automatic and sound for safety and liveness properties. Sufficient conditions 
for termination of the resulting procedures can be derived by using the 
theory of [ACJT96]. As a case-study, we apply the method to check safety 
and liveness properties for a formal model of Steve German’s 
directory-based consistency protocol [PRZ01]. 



1 Introduction 

Formal verification of client-server protocols is an important and challenging 
problem. Client-server architectures are present at different levels of 
abstraction in modern computer systems. Consistency protocols for client-server 
architectures are used, e.g., in multiprocessor systems with shared memory and 
local caches, distributed file systems, distributed database systems, and 
web-based applications to ensure the coherency of distributed data. An important 
class of consistency protocols makes use of central servers to serialize the access 
to the data. Protocols of this kind are often validated on test sets with a fixed 
number of clients. In many interesting examples, however, it is not possible to 
fix an a priori bound on the number of clients requesting access to the data. 
This assumption makes the application of automated (push-button) verification 
methods like BDD-based symbolic model checking [McM93] and state 
exploration [Hol88] problematic. State explosion limits de facto the applicability of 
finite-state techniques like symbolic model checking to concurrent systems with 
a relatively small number of components. Thus, although useful for debugging, 
in general symbolic model checking cannot help us in automatically proving a 
protocol correct for any possible number of clients. 

In recent years much effort has been spent in order to lift symbolic model 
checking from finite- to infinite-state applications. Following [KMM+97], this 
goal can be achieved by employing rich assertional languages to reason about 
potentially infinite collections of system states. This idea finds a natural 
counterpart in the paradigm of constraint-based model checking, see e.g. [BGP99, DP99, 
Fri00]. In this setting, the solutions of existentially quantified constraint 
formulas are used as denotations of an infinite collection of system states. Algorithmic 
verification procedures for temporal formulas are then defined on top of existing 
constraint solvers such as a Presburger arithmetic solver as in [BGP99], and a 
real constraint solver as in [DP99]. 

In this paper we show that several verification problems of protocols designed 
for client-server architectures with a finite but potentially unbounded number of 
clients can be naturally solved using the composite- constraint approach proposed 
in [BGL00]. In this approach constraints over heterogeneous data are used as 
symbolic representation of states. The methodology we follow consists of the 
following steps. 

We first specify the server and a generic client using finite-state 
communicating machines in the style of [BCR01, EN98, GS92, Del00]. In our model we 
allow synchronous and asynchronous communication mechanisms. Furthermore, 
we allow global variables with Boolean type. As main case-study, we present 
a formal model for the consistency protocol proposed by Steven German in 
[Ger00, PRZ01]. Many other examples can be modeled this way as shown, e.g., 
in [BCR01, Del00, EFM99]. The verification of the safety properties studied in 
[PRZ01] amounts to the following parameterized reachability problem: one has to 
show that for any number of clients unsafe states can never be reached. 

Following the methodology proposed in [Del00], we apply a counting 
abstraction to reduce the family of communicating finite-state machines indexed on the 
number of clients to a transition system with Boolean and integer variables. 
Intuitively, the counting abstraction maps a global state (whose size depends on 
the number of clients) into a finite tuple of Boolean and integer values, in which 
we keep track of the current server state, the value of the global variables, and 
the number of clients in every possible local state. A formal model based on 
communicating finite-state machines can be compiled automatically into an 
abstract protocol using a set of rules mapping protocol transitions into guarded 
commands defined over Boolean and integer variables. Via this abstraction, 
verification of safety properties can be reduced to a reachability problem in which 
initial and target states can be expressed as composite constraints, i.e., formulas 
over Boolean and integer variables. The Action Language Verifier [BYK01], a 
constraint-based CTL model checker, can then be used to attack this kind of 
verification problem. Action Language Verifier is built on top of the Composite 
Symbolic Library [YKTB01] which provides operations to manipulate composite 






G. Delzanno and T. Bultan 



constraints, by integrating a BDD library [CUDD], and a Presburger arithmetic 
manipulator [Pug92, KMP+95]. 

Using the theory proposed in [ACJT96] , it is possible to prove the decid- 
ability of the resulting verification method for safety properties expressed via 
a special class of composite constraints in which the arithmetic part denotes 
upward closed sets of abstract states. Interestingly, the safety properties for 
German’s protocol considered in [PRZ01] can be expressed using this class of 
composite constraints. 

As a practical result, we were able to automatically verify interesting safety 
properties like mutual exclusion for readers and writers for our case-study. Being 
a full-fledged model checker for temporal properties expressed in CTL, the Action 
Language Verifier also allowed us to automatically verify liveness properties. 
To our knowledge, this is the first time that constraint technology based on 
composite symbolic representations is used to verify formal models of 
client-server protocols for an arbitrary number of clients. 

Plan of the paper. In Section 2 we will informally describe our case-study. 
In Section 3 we will show how to formally specify it. In Section 4 we will 
introduce the counting abstraction. In Sections 5, 6 and 7 we will describe the 
tools we used to analyze the abstract protocol and the results we obtained. 
Finally, in Section 8 we will draw some conclusions and discuss related works. 



2 A Consistency Protocol for Multi-client Systems 

In this section we informally describe a directory-based consistency protocol for 
multi-client systems with shared data (cache lines, memory pages, etc.) inspired 
by the protocol proposed by Steven German [Ger00] and presented in [PRZ01]. The 
protocol is designed for a system consisting of a single home node and an 
arbitrary number of clients. The home node serializes requests for the data. A 
transaction begins when a client with null access rights sends a request either 
for shared or exclusive access to the home node. If the home node is not serving 
another request (it is idle), it can pick up a new request from one of the clients. 
The home node maintains the set of sharers' identifiers, and the list of sharers 
that have to be invalidated before serving a given request. Furthermore, it uses 
an internal Boolean flag, which we will call ex, to indicate whether or not home granted 
exclusive access to the data. When the home node is granting exclusive access, or 
granting shared access and there is a client with exclusive access right, the home 
node must invalidate all clients. The home node sends out invalidate messages 
to one client at a time. When a client in state shared or exclusive receives an 
invalidate message, it downgrades its access rights, and sends an acknowledgment 
back to the home node. The home node removes the client from the list of sharers 
when it gets the invalidate acknowledgment. When all necessary invalidations 
have been done, the home node sends a reply message to the client who made the 
request. A reply is either a grant of shared access or a grant of exclusive access. 
The client updates its access when it receives a grant message from the home 
node. The protocol should ensure the following two safety properties (the first 
one is also considered in [PRZ01]): (P1) at most one process at a time can obtain 
the exclusive access right; (P2) exclusive and shared access rights are mutually 
exclusive. The challenge here is to prove P1 and P2 for any number of clients. 
For this purpose, we will first turn the informal specification into a formal one. 

3 Communicating Finite-State Machines 

The specification language we propose is obtained by merging the asynchronous 
CCS-like model of [GS92] (one monitor, and many clients with asynchronous 
communication), the broadcast protocols of [EN98] (synchronous 
communication), the model used to specify cache coherence protocols of [Del00] 
(synchronous communication, conditions over the global state), and the global/local 
machines proposed in [BCR01] (asynchronous communication with global and 
local variables). 

Global machines. A global machine is a tuple $\langle B, Q_s, Q, \Sigma, \delta \rangle$, where: $B$ is the 
tuple of global Boolean variables; $Q_s$ is the finite set of states of the server; $Q$ is 
the finite set of states of the local machines; and $\Sigma$ is the set of synchronization 
labels used to build the set of possible actions $A_\Sigma$ of a process. Specifically, let 
$\varphi$ be a Boolean formula over $B$ and $B'$ (the primed version of the variables in 
$B$). Then, an action has one of the following forms: 

— Internal action: $\ell : \varphi$ for $\ell \in \Sigma$. 

— Rendez-vous: $!\ell : \varphi$ (send) and $?\ell$ (receive). 

— Broadcast: $!!\ell : \varphi$ (send) and $??\ell$ (receive). 

The Boolean formula $\varphi$ is used to express pre- and post-conditions (using primed 
variables) on the global variables $B$. In the rest of the paper, we will use $\ell$ 
to indicate the action $\ell : true$. We will clarify the semantics of actions in the 
next paragraphs. The behavior of the server and of the clients is described via 
the transition relation $\delta : (Q_s \times A_\Sigma \times Q_s) \cup (Q \times A_\Sigma \times Q)$. In the following, 
we will use $s \xrightarrow{a} s'$ to indicate that $\langle s, a, s' \rangle \in \delta$, and we will restrict $\delta$ to 
be deterministic. In order to define an operational semantics we must fix the 
number of clients, say $k$, as shown next. A global state for $k$ clients is a tuple 
$G = \langle s_0, \rho, \mathbf{s} \rangle$, where $s_0 \in Q_s$ (server state), $\rho$ is an evaluation for the variables 
in $B$, and $\mathbf{s} = \langle s_1, \ldots, s_k \rangle$ (local states) is such that $s_i \in Q$ for $i : 1, \ldots, k$. The 
execution of a protocol is formalized through the relation $\Rightarrow$ defined next. Let 
$G = \langle s_0, \rho, \mathbf{s} \rangle$, $\mathbf{s} = \langle s_1, \ldots, s_k \rangle$, $G' = \langle s'_0, \rho', \mathbf{s}' \rangle$, and $\mathbf{s}' = \langle s'_1, \ldots, s'_k \rangle$. Define 
$\gamma = \rho \cup \rho'$. Then, $G \Rightarrow G'$ provided one of the following conditions holds: 

— $\exists i$ s.t. $s_i \xrightarrow{\ell : \varphi} s'_i$, $\gamma(\varphi) = true$. 

— $\exists i, j$ s.t. $s_i \xrightarrow{!\ell : \varphi} s'_i$, $\gamma(\varphi) = true$, and $s_j \xrightarrow{?\ell} s'_j$. 

— $\exists i$ s.t. $s_i \xrightarrow{!!\ell : \varphi} s'_i$, $\gamma(\varphi) = true$, and $\forall j$ s.t. $\delta$ is defined on $??\ell$, $s_j \xrightarrow{??\ell} s'_j$. 
Fig. 1. Specification of the Home node. 



In all the previous cases we assume that: $\rho'(b') = \rho(b)$ for any variable $b \in B$ 
such that $b'$ does not occur in the guard; and $s_i = s'_i$ if the $i$-th client is not 
involved in the action. A run of a global machine is a sequence of global states 
$G_0 G_1 \ldots$ such that $G_i \Rightarrow G_{i+1}$ for $i \ge 0$. $G_0$ is the initial global state of the 
run. A global state $G'$ is reachable from $G$, written $G \Rightarrow^* G'$, if and only if there 
exists a run from $G$ to $G'$. 



3.1 A Formal Model for the Consistency Protocol 

To specify our protocol, we use a Boolean variable ex (representing the flag home 
granted exclusive right), the machine for the home node described in Fig. 1 and 
the one for a generic client described in Fig. 2. Recall that the home node is 
supposed to serialize the requests and serve one client at a time. As in [Ger00, 
PRZ01], we consider message buffers of capacity at most one. Using synchronous 
communication, via the labels reqE and reqS we model the capability of the 
home node of storing the identifier of the client to be served: on reception of 
a request, home moves from Idle to one of the ‘busy’ states ServeE, ServeS. 
Differently from [Ger00, PRZ01], instead of handling invalidation via two global 
variables storing the identifiers of clients to be invalidated, we use broadcast 
communication as explained below. Let us assume that the home node has to 
serve a request for exclusive access. Since all sharers must be invalidated, the 
server sends an invalidation broadcast to all clients in state shared. All sharers 
react to the broadcast by downgrading their access rights. After having invalidated 
all sharers, home checks the flag ex to see if it still needs to invalidate clients with 
exclusive access. If the flag is on, instead of using broadcast, home assumes that 
only one process can be in exclusive state, and sends it the invalidation message 
invE using synchronous communication. The same situation repeats when the 
home node has to serve a request for shared access and the flag ex is on. On 
reception of the invalidation message, the client with exclusive access downgrades 
it to null. The flag ex is set to false (using the post-condition $\neg ex'$) after the 
invalidation process in state ServeS and InvE. The flag ex is set to true (using 
the post-condition $ex'$) after granting exclusive access. When the home node is 
in state ServeS and ex is off, the server immediately grants shared access to the 



requesting client. In addition to the rule specified in |Ger00IPRZ01| . we add the 
possibility for sharers to request an upgrade of their rights. This is accomplished 
via the transition Shared — > ReqE labeled with reqE in Fig.|^ The initial global 
state of the protocol with k clients is defined as (Idle, (false), s), where s is the 
vector (si, . . . , Sk) and Si = . . . = = Null. 



Verification of Safety Properties. Let $G(k)$ denote a global state with $k$
clients, and $\mathcal{U}_F(k)$ be the set of unsafe global states with $k$
clients w.r.t. a given safety property $F$. Then, we say that the abstract
protocol is $k$-safe if and only if there are no runs $G_0(k) \to^{*} G(k)$ such
that $G(k) \in \mathcal{U}_F(k)$. In order to prove that a protocol is safe for
all possible system configurations, it is then necessary to prove that it is
$k$-safe for any $k \geq 1$. According to [EN98], we will call the reachability
problem for arbitrary values of $k$ a parameterized reachability problem. In our
example $\mathcal{U}_{P1}$ and $\mathcal{U}_{P2}$ can be characterized as the
sets of global states $G$ containing the following minimal violations: (P1) $G$
contains one occurrence of Shared and one of Exclusive; (P2) $G$ contains two
occurrences of Exclusive. In other words, as often happens with safety
properties, the set of unsafe states is upward closed (if a global state with
$k$ processes contains a violation generated by $k' \leq k$ processes, then it
is unsafe). Furthermore, note that the description of unsafe states is
independent of the identifiers of individual clients. In fact, we are not
interested in proving that process 2 and process 6 do not violate mutual
exclusion; we want to prove it for any pair of clients.
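
The two minimal violations and the upward-closure remark can be made concrete
with a small sketch. It is written in Python purely for illustration; the
function names are hypothetical, and a global state is simplified to the list of
local client states (the home node and the flag ex are left out).

```python
from collections import Counter

def violates_p1(client_states):
    """P1: some client is Shared while another one is Exclusive."""
    counts = Counter(client_states)
    return counts["Shared"] >= 1 and counts["Exclusive"] >= 1

def violates_p2(client_states):
    """P2: at least two clients are Exclusive."""
    return Counter(client_states)["Exclusive"] >= 2

def is_unsafe(client_states):
    # Only the number of clients per local state matters, never their identity.
    return violates_p1(client_states) or violates_p2(client_states)

minimal = ["Shared", "Exclusive"]             # a minimal P1 violation, k = 2
larger = ["Null", "WaitS"] + minimal          # the same violation with k = 4

print(is_unsafe(minimal), is_unsafe(larger))  # adding clients keeps it unsafe
```

Because the check only counts occurrences, permuting or adding clients cannot
turn an unsafe state into a safe one, which is the upward-closure property used
later for termination.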



4 An Abstract Model 

When trying to check safety properties that can be expressed independently of
individual identifiers, it is often very useful to apply the following counting
abstraction. The idea is to define an abstract state consisting of: (1) a
control part obtained by merging the Boolean variables and the server control
location; (2) a collection of counters that keep track of the number of clients
in each local state $q \in Q$. Formally, let $G = (s, \rho, \mathbf{s})$ be a
global state. The abstract state is defined as $G^{\#} = (s, \rho, \mathbf{c})$,
where $\mathbf{c} = (c_1, \ldots, c_n)$, $c_i$ is the number of occurrences of
$q_i$ in $\mathbf{s}$ for $i = 1, \ldots, n$, and $n = |Q|$. When applied to the
transition relation $\delta$, the counting abstraction returns the abstract
protocol $M^{\#}$, which can be described as a transition system with Boolean
and integer data paths. Formally, the abstract protocol consists of the control
locations $Q_s$, the Boolean variables $\mathcal{B}$, and the non-negative
integer variables $\mathbf{x} = (x_1, \ldots, x_n)$; $x_i$ represents the
counter of the number of clients in state $q_i \in Q$. In the rest of the paper
we will often use $x_q$ to denote the counter associated to state $q \in Q$.
Abstract transitions are guarded commands $s \to s' : C$, where
$s, s' \in Q_s$, and $C$ is a formula defined over the variables in
$\mathcal{B} \cup \mathcal{B}' \cup \mathbf{x} \cup \mathbf{x}'$ as follows.

- The internal action $s \to t$ is compiled into the formula
  $\varphi \wedge x_s \geq 1 \wedge x'_s = x_s - 1 \wedge x'_t = x_t + 1$.

- The rendez-vous $p \to q$, $r \to s$ (all states distinct from each other) is
  compiled into the formula
  $\varphi \wedge x_p \geq 1 \wedge x_r \geq 1 \wedge x'_p = x_p - 1 \wedge
  x'_q = x_q + 1 \wedge x'_r = x_r - 1 \wedge x'_s = x_s + 1$.

- Finally, consider the broadcast $p \to q$, $s_i \to s$ for $i = 1, \ldots, m$
  (all states distinct from each other). It is compiled into the formula
  $\varphi \wedge x_p \geq 1 \wedge x'_p = x_p - 1 \wedge x'_q = x_q + 1 \wedge
  x'_s = x_s + x_{s_1} + \ldots + x_{s_m} \wedge x'_{s_1} = 0 \wedge \ldots
  \wedge x'_{s_m} = 0$.
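
To make the three compilation rules concrete, the following Python sketch fires
each kind of abstract transition on a dictionary of counters. The function names
are hypothetical, and the Boolean part of the guard is omitted so that only the
counter updates are shown; a disabled transition returns None.

```python
def internal(x, s, t):
    """Internal action s -> t: one client leaves s and enters t."""
    if x[s] < 1:
        return None
    y = dict(x)
    y[s] -= 1
    y[t] += 1
    return y

def rendezvous(x, p, q, r, s):
    """Rendez-vous p -> q, r -> s: two clients move in a single step."""
    if x[p] < 1 or x[r] < 1:
        return None
    y = dict(x)
    y[p] -= 1
    y[q] += 1
    y[r] -= 1
    y[s] += 1
    return y

def broadcast(x, p, q, receivers, s):
    """Broadcast p -> q, s_i -> s: the sender moves, and every client in a
    receiving state s_i is transferred to s (the counters of s_i become 0)."""
    if x[p] < 1:
        return None
    y = dict(x)
    y[p] -= 1
    y[q] += 1
    y[s] += sum(x[si] for si in receivers)
    for si in receivers:
        y[si] = 0
    return y
```

Counters that are not mentioned by a transition are copied unchanged, and the
total number of clients is preserved by all three functions.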

In all the above cases, additional constraints of the form $x'_s = x_s$ and
$b' = b$ are implicitly assumed for all integer and Boolean variables that are
not involved in any action. The transition system resulting from the application
of the counting abstraction is a Vector Addition System with state (server
location and Boolean variables), a model underlying the usual operational
semantics of Petri Nets, extended with special transfer arcs associated to
broadcast operations [EN98, EFM99, Del00]. Given two abstract states
$G_1^{\#} = (s, \rho, \mathbf{c})$ and $G_2^{\#} = (s', \rho', \mathbf{c}')$, we
say that $G_1^{\#} \to G_2^{\#}$ if and only if there exists a transition
$s \to s' : C$ in $M^{\#}$ such that
$C[\rho/\mathcal{B}, \rho'/\mathcal{B}', \mathbf{c}/\mathbf{x},
\mathbf{c}'/\mathbf{x}'] = true$. Given an abstract protocol $M^{\#}$ and an
initial state $G_0^{\#}$, an abstract run is a sequence
$G_0^{\#}, G_1^{\#}, \ldots$ of abstract states such that
$G_i^{\#} \to G_{i+1}^{\#}$ for $i \geq 0$. Then, we have the following
proposition.

Proposition 1. Let $M$ be a global machine, and $M^{\#}$ be the corresponding
abstract protocol. Then, $G_0 \to^{*} G_1$ if and only if
$G_0^{\#} \to^{*} G_1^{\#}$ for any $G_0, G_1$.

The abstract protocol for the example of Section 3 is described by the
transitions of Fig. 3, defined over the control locations Idle, ServeS, ServeE,
GrantE, InvE and GrantS, the Boolean variable ex, and the integer variables
$x_N$ for the client state Null, $x_{WE}$ for WaitE, $x_{WS}$ for WaitS, $x_S$
for Shared, and $x_E$ for Exclusive. It is important to note that the
representation of abstract global states is independent of the number of
clients, and that it fully exploits the symmetries in their behavior. To check
properties P1 and P2, it remains to describe our approach to the reachability
problem arising from Proposition 1.
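
As an illustration only, the transitions of Fig. 3 can be exercised by explicit
enumeration for a fixed number of clients $k$. The Python sketch below encodes
the guards as we read them from Fig. 3 and uses hypothetical helper names
(successors, reachable, unsafe). It checks P1 and P2 for small $k$ only; it is
not the parameterized method developed in the following sections.

```python
from collections import deque

CLIENT = ("N", "WE", "WS", "S", "E")  # Null, WaitE, WaitS, Shared, Exclusive

def move(x, src, dst):
    """Copy the counters and move one client from src to dst."""
    y = dict(x)
    y[src] -= 1
    y[dst] += 1
    return y

def successors(state):
    """Abstract successors of (location, ex, counters), following Fig. 3."""
    loc, ex, xs = state
    x = dict(zip(CLIENT, xs))
    out = []

    def emit(new_loc, new_ex, y):
        out.append((new_loc, new_ex, tuple(y[c] for c in CLIENT)))

    if loc == "Idle":
        if x["N"] >= 1:
            emit("ServeS", ex, move(x, "N", "WS"))      # reqS
            emit("ServeE", ex, move(x, "N", "WE"))      # reqE
        if x["S"] >= 1:
            emit("ServeE", ex, move(x, "S", "WE"))      # reqE (upgrade)
    elif loc == "ServeS":
        if ex and x["E"] >= 1:
            emit("GrantS", False, move(x, "E", "N"))    # inv
        if not ex:
            emit("GrantS", ex, x)                       # nonex
    elif loc == "ServeE":
        y = dict(x)
        y["N"] += y["S"]                                # invS: broadcast
        y["S"] = 0
        emit("InvE", ex, y)
    elif loc == "InvE":
        if ex and x["E"] >= 1:
            emit("GrantE", False, move(x, "E", "N"))    # invE
        if not ex:
            emit("GrantE", ex, x)                       # nonex
    elif loc == "GrantS":
        if x["WS"] >= 1:
            emit("Idle", ex, move(x, "WS", "S"))        # grantS
    elif loc == "GrantE":
        if x["WE"] >= 1:
            emit("Idle", True, move(x, "WE", "E"))      # grantE
    return out

def reachable(k):
    """All abstract states reachable from (Idle, ex = false, x_N = k)."""
    init = ("Idle", False, (k, 0, 0, 0, 0))
    seen, queue = {init}, deque([init])
    while queue:
        for nxt in successors(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def unsafe(state):
    """Violation of P1 (Shared with Exclusive) or P2 (two Exclusive)."""
    x = dict(zip(CLIENT, state[2]))
    return (x["S"] >= 1 and x["E"] >= 1) or x["E"] >= 2
```

For instance, `any(unsafe(s) for s in reachable(3))` explores the finite state
space for three clients. Such a check says nothing about larger values of $k$,
which is why a symbolic representation of infinitely many abstract states is
needed.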

5 Composite Symbolic Representation 



(reqS)    Idle → ServeS    :  $x_N \geq 1 \wedge x'_N = x_N - 1 \wedge x'_{WS} = x_{WS} + 1$
(reqE)    Idle → ServeE    :  $x_N \geq 1 \wedge x'_N = x_N - 1 \wedge x'_{WE} = x_{WE} + 1$
(reqE)    Idle → ServeE    :  $x_S \geq 1 \wedge x'_S = x_S - 1 \wedge x'_{WE} = x_{WE} + 1$
(inv)     ServeS → GrantS  :  $ex \wedge \neg ex' \wedge x_E \geq 1 \wedge x'_E = x_E - 1 \wedge x'_N = x_N + 1$
(nonex)   ServeS → GrantS  :  $\neg ex$
(invS)    ServeE → InvE    :  $x'_N = x_N + x_S \wedge x'_S = 0$
(invE)    InvE → GrantE    :  $ex \wedge \neg ex' \wedge x_E \geq 1 \wedge x'_N = x_N + 1 \wedge x'_E = x_E - 1$
(nonex)   InvE → GrantE    :  $\neg ex$
(grantS)  GrantS → Idle    :  $x_{WS} \geq 1 \wedge x'_{WS} = x_{WS} - 1 \wedge x'_S = x_S + 1$
(grantE)  GrantE → Idle    :  $x_{WE} \geq 1 \wedge x'_{WE} = x_{WE} - 1 \wedge x'_E = x_E + 1 \wedge ex'$

Fig. 3. Abstract client-server protocol.

In order to analyze the behavior of a protocol for any possible number of clients,
we need a finite representation for infinite collections of abstract states. One
such representation could be obtained using linear constraints to encode sets of
tuples of integer and Boolean values. However, since manipulation of arithmetic
constraints is expensive, this strategy is not likely to scale. To solve this
problem, we will use the composite constraints of [BGL00] as a symbolic
representation of infinite collections of global states. To explain this idea,
let us first introduce a new set $\mathcal{L}$ of Boolean variables, which will
be used to encode the control locations $Q_s$ of the server; if $|Q_s| = m$,
then we need $\log_2 m$ variables. In our setting, a composite constraint is a
formula $\varphi_{bool} \wedge \varphi_{int}$, where $\varphi_{bool}$ is a
Boolean formula over the Boolean variables $\mathcal{B} \cup \mathcal{L}$, and
$\varphi_{int}$ is a disjunction of linear arithmetic constraints over the
variables $\mathbf{x}$ of the abstract protocol. The denotation of a composite
constraint is defined as follows:

$[\![\varphi_{bool} \wedge \varphi_{int}]\!] = \{(s, \rho, \mathbf{c}) \mid
\varphi_{bool}$ is true in $\rho_s \cup \rho$, and $\varphi_{int}$ is satisfied
in $\mathbf{c}\}$,

where $\rho_s$ is the evaluation of the variables $\mathcal{L}$ encoding
location $s \in Q_s$. Composite constraints allow us to finitely and compactly
represent initial and unsafe states for parameterized verification problems that
can be formulated independently of client identifiers. As an example, the
initial configuration of our protocol is described by the composite constraint
$\Phi_0$ defined as $\varphi_{Idle} \wedge \neg ex \wedge x_N \geq 1 \wedge
x_S = 0 \wedge x_E = 0 \wedge x_{WE} = 0 \wedge x_{WS} = 0$, where
$\varphi_{Idle}$ is the Boolean formula over $\mathcal{L}$ representing location
Idle. Furthermore, the set of potential violations of the mutual exclusion
properties P1 and P2 can be represented as $\Phi_1 \vee \Phi_2$, where
$\Phi_1 = x_S \geq 1 \wedge x_E \geq 1$ and $\Phi_2 = x_E \geq 2$. Based on this
observation, it follows that we can reduce the verification problem for $M$ and
properties P1 and P2 to the following reachability problem for $M^{\#}$: for any
$G_0^{\#} \in [\![\Phi_0]\!]$, there are no runs $G_0^{\#} \to^{*} G^{\#}$ of
$M^{\#}$ such that $G^{\#} \in [\![\Phi_1 \vee \Phi_2]\!]$.

Based on this idea, we encode collections of abstract states of the protocol
using composite symbolic representations, which are disjunctions of composite
constraints [BGL00]. Formally, a composite symbolic representation $\Phi$ is in
the form:

$\Phi = \bigvee_i \varphi_{bool_i} \wedge \varphi_{int_i}$

where each $\varphi_{bool_i}$ is a Boolean formula, and each $\varphi_{int_i}$
is a disjunction (set) of linear arithmetic constraints as mentioned above. Each
$\varphi_{int_i}$ can be represented in a disjunctive form as
$\varphi_{int_i} = \bigvee_j \varphi_{int_{ij}}$, where
$\varphi_{int_{ij}} = \bigwedge_k c_{ijk}$ and each $c_{ijk}$ is an atomic
linear constraint. Operations on arithmetic and Boolean constraints can be used
to implement a symbolic predecessor operator Pre that computes the effect of
firing the transitions of an abstract protocol backwards on a composite symbolic
representation. We first note that we can represent a guarded command $t$ of
$M^{\#}$ via the composite constraint $\varphi_t$ defined over the variables
$\mathcal{B}, \mathcal{L}, \mathbf{x}$ and their primed versions
$\mathcal{B}', \mathcal{L}', \mathbf{x}'$ ($\mathcal{L}$ and $\mathcal{L}'$ are
used to represent the old and the new control locations, respectively). Based on
this observation, $Pre_t(\Phi)$ is defined as the existentially quantified
formula (with free variables in $\mathcal{L}, \mathcal{B}, \mathbf{x}$) given
by:

$Pre_t(\Phi) = \exists \mathcal{L}'. \exists \mathcal{B}'. \exists \mathbf{x}'.\;
\varphi_t(\mathcal{L}, \mathcal{B}, \mathbf{x}, \mathcal{L}', \mathcal{B}',
\mathbf{x}') \wedge \left( \bigvee_i \varphi_{bool_i}(\mathcal{L}',
\mathcal{B}') \wedge \varphi_{int_i}(\mathbf{x}') \right)$

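Since existential quantification distributes over disjunction, we get

$Pre_t(\Phi) = \bigvee_i Pre_t(\Phi_i)$, where
$\Phi_i = \varphi_{bool_i} \wedge \varphi_{int_i}$.

By hypothesis, Boolean and arithmetic constraints have no variables in common.
Thus, the existential quantification also distributes over the conjunction to
obtain

$Pre_t(\Phi_i) = (\exists \mathcal{L}'. \exists \mathcal{B}'.\;
\varphi'_{bool_i}) \wedge (\exists \mathbf{x}'.\; \varphi'_{int_i})$

where $\varphi'_{bool_i}$ and $\varphi'_{int_i}$ are obtained by collecting
together, respectively, the Boolean and arithmetic constraints of $\varphi_t$,
$\varphi_{bool_i}$, and $\varphi_{int_i}$. Furthermore, we can distribute the
existential quantification over the set of linear arithmetic constraints in
$\varphi'_{int_i} = \bigvee_j \varphi'_{int_{ij}}$, such that:

$Pre_t(\Phi_i) = (\exists \mathcal{L}'. \exists \mathcal{B}'.\;
\varphi'_{bool_i}) \wedge \left( \bigvee_j (\exists \mathbf{x}'.\;
\varphi'_{int_{ij}}) \right)$

Eliminating $\mathbf{x}'$ amounts to replacing every primed variable with its
definition in $\varphi_t$. Hence, if $\varphi'_{int_{ij}}$ is a set of linear
constraints, so is $\exists \mathbf{x}'. \varphi'_{int_{ij}}$. The symbolic
predecessor operator associated with $M^{\#}$ is then defined as:

$Pre(\Phi) = \bigvee_{t \in M^{\#}} Pre_t(\Phi)$.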

The operator preserves our composite symbolic representation. Furthermore, it
is easy to check that

$[\![Pre(\Phi)]\!] = \{ G^{\#} \mid G^{\#} \to G'^{\#},\; G'^{\#} \in
[\![\Phi]\!] \}$.

Symmetrically, it is possible to define a symbolic successor operator Post such
that $Post(\Phi)$ returns the set of abstract states reachable from $\Phi$ (we
omit its definition for brevity).
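
The elimination of the primed counters can be sketched for a restricted case.
The Python fragment below assumes, for illustration only, that the arithmetic
part of a constraint is a vector of lower bounds (one atom $x_i \geq l_i$ per
counter) and that a transition has a lower-bound guard and a constant update
$\mathbf{x}' = \mathbf{x} + \mathbf{d}$. Boolean parts and broadcast updates are
not covered, and the function name is hypothetical.

```python
def pre_lower_bounds(lower, guard, delta):
    """Predecessor of {x' | x' >= lower} under: x >= guard, x' = x + delta.

    Substituting x' = x + delta into x' >= lower gives x >= lower - delta;
    the result is conjoined with the guard and with non-negativity.
    """
    return tuple(max(g, l - d, 0) for l, g, d in zip(lower, guard, delta))

# Counters ordered as (x_N, x_WE, x_WS, x_S, x_E).
unsafe_p2 = (0, 0, 0, 0, 2)       # x_E >= 2
grant_e_guard = (0, 1, 0, 0, 0)   # x_WE >= 1
grant_e_delta = (0, -1, 0, 0, 1)  # x'_WE = x_WE - 1, x'_E = x_E + 1

print(pre_lower_bounds(unsafe_p2, grant_e_guard, grant_e_delta))
# (0, 1, 0, 0, 1), i.e. x_WE >= 1 and x_E >= 1
```

The result is again a vector of lower bounds, a simple instance of the closure
property stated above for the composite representation.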

Symbolic forward and backward exploration procedures can then be implemented
using the Pre and Post operators. The symbolic forward exploration procedure
works on a composite symbolic representation Current. Given an initial set of
composite constraints $\Phi_0$, we first set Current := $\Phi_0$. Then, we apply
Post to all the constraints in Current to compute a new set of constraints New.
If each composite constraint in New entails Current, we stop. Otherwise, we add
the composite constraints in New to Current and continue. Symbolic backward
exploration can be implemented similarly by starting from the composite
constraints representing unsafe states, and using the Pre operator at each
step.
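
The loop just described can be sketched generically. In the Python fragment
below, post and entails are parameters supplied by the caller; the iteration
bound max_iter and the toy interval domain are assumptions added for the
example. Entailment of Current is approximated by entailment of a single
disjunct, which is sufficient for stopping but not complete.

```python
def forward_explore(initial, post, entails, max_iter=1000):
    """Iterate Post until every new constraint entails one already in Current."""
    current = list(initial)
    for _ in range(max_iter):
        new = [c for old in current for c in post(old)]
        fresh = []
        for c in new:
            if not any(entails(c, d) for d in current + fresh):
                fresh.append(c)
        if not fresh:
            return current          # fixpoint reached
        current.extend(fresh)
    raise RuntimeError("no fixpoint within max_iter iterations")

# Toy domain: a constraint (lo, hi) denotes lo <= x <= hi for one counter
# that is incremented up to the bound 3.
def toy_post(c):
    lo, hi = c
    return [(min(lo + 1, 3), min(hi + 1, 3))]

def toy_entails(a, b):
    return b[0] <= a[0] and a[1] <= b[1]

print(forward_explore([(0, 0)], toy_post, toy_entails))
# [(0, 0), (1, 1), (2, 2), (3, 3)]
```

Backward exploration has the same shape, with the unsafe constraints as the
initial set and a predecessor function in place of post.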

In order to keep the number of disjuncts generated during a fixpoint
computation small, it is possible to use simplification rules such as the ones
used in the Composite Symbolic Library described in the next section: for each
composite constraint $\varphi_{bool_i} \wedge \varphi_{int_i}$ it checks if
$\varphi_{bool_i}$ is satisfiable, and removes the constraint if it is not; it
looks for composite constraints $\varphi_{bool_i} \wedge \varphi_{int_i}$ and
$\varphi_{bool_j} \wedge \varphi_{int_j}$ such that
$[\![\varphi_{bool_i}]\!] = [\![\varphi_{bool_j}]\!]$, and merges them to form
one composite constraint
$\varphi_{bool_i} \wedge (\varphi_{int_i} \vee \varphi_{int_j})$. Since the
Boolean part of the composite constraint allows efficient equivalence and
satisfiability checks (as is the case for BDDs), these simplification operations
can be implemented efficiently and applied after each step in the symbolic
forward and backward exploration procedures.
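
A minimal sketch of these two simplification rules follows, in Python. It
assumes a stand-in representation that is not the one used by the library: the
Boolean part is a frozenset of satisfying assignments (canonical, like a BDD, so
equality coincides with equivalence), and the arithmetic part is a frozenset of
disjuncts, written here as plain strings.

```python
def simplify(constraints):
    """Drop unsatisfiable Boolean parts and merge equal ones.

    constraints: iterable of (phi_bool, phi_int) pairs, where phi_bool is a
    frozenset of models and phi_int is a frozenset of arithmetic disjuncts.
    """
    merged = {}
    for phi_bool, phi_int in constraints:
        if not phi_bool:               # no model: unsatisfiable, remove
            continue
        # Same Boolean part: take the disjunction of the arithmetic parts.
        merged[phi_bool] = merged.get(phi_bool, frozenset()) | phi_int
    return list(merged.items())

idle_no_ex = frozenset({("Idle", False)})
result = simplify([
    (idle_no_ex, frozenset({"xE >= 2"})),
    (frozenset(), frozenset({"xS >= 1"})),
    (idle_no_ex, frozenset({"xS >= 1 and xE >= 1"})),
])
print(len(result))  # one composite constraint remains
```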

Interestingly, symbolic backward and forward exploration are not equivalent.
As shown in [EFM99], symbolic forward exploration (enriched with acceleration
operators à la Karp and Miller [EN98]) may not terminate for transition systems
associated to broadcast protocols (a subclass of global machines). On the other
hand, symbolic backward exploration is always guaranteed to terminate when the
seed of the exploration is a constraint representing an upward-closed set of
abstract states. We discuss this point next.



Conditions for Termination. One interesting class of linear constraints that
can be used to represent sets of unsafe states with the special property of
being upward-closed is that of additive constraints considered in [DEP99]. An
additive constraint consists of a conjunction of atomic formulas of the form
$a_1 \cdot y_1 + \ldots + a_n \cdot y_n \geq c$, where each $a_i$ is a
non-negative integer constant, and each $y_i$ is a variable ranging over
non-negative integers. As shown in [DEP99], the class of additive constraints
equipped with the usual notion of entailment between linear constraints forms a
well-quasi ordering. This implies that there cannot be infinite chains of
additive constraints whose elements are not comparable to each other with
respect to the entailment relation. Composite additive constraints are obtained
by restricting the linear arithmetic part of a composite constraint to be
additive. Composite additive constraints are closed under application of the
symbolic operator Pre associated to an abstract protocol. As a consequence, we
have the following result.

Proposition 2. Let $\Phi$ be a composite constraint representation. Then, the
symbolic backward exploration algorithm taking $\Phi$ as seed of the computation
terminates and computes a symbolic representation of $Pre^{*}([\![\Phi]\!])$.

Note that the composite constraints representing the unsafe states associated
to properties P1 and P2 of Section 2 are composite additive constraints. As a
consequence, we have the following corollary.

Corollary 1. The verification of properties P1 and P2 for the protocol of
Section 3 is decidable.
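
The role of entailment in the termination argument can be illustrated on the
simplest additive constraints, namely vectors of lower bounds (each atom has the
form $y_i \geq c_i$). The Python sketch below, with hypothetical function names,
keeps only the minimal elements of an upward-closed set; the well-quasi ordering
guarantees that such a basis stays finite.

```python
def entails(a, b):
    """{y | y >= a} is contained in {y | y >= b} iff a >= b pointwise."""
    return all(ai >= bi for ai, bi in zip(a, b))

def add_minimal(basis, candidate):
    """Add candidate to a basis of minimal elements.

    Returns the updated basis and a flag telling whether candidate
    contributed new states (False means it was already entailed).
    """
    if any(entails(candidate, b) for b in basis):
        return basis, False
    kept = [b for b in basis if not entails(b, candidate)]
    return kept + [candidate], True

basis = [(2, 0)]
basis, changed = add_minimal(basis, (3, 1))   # entailed by (2, 0): ignored
basis, changed = add_minimal(basis, (1, 0))   # subsumes (2, 0): replaces it
basis, changed = add_minimal(basis, (0, 5))   # incomparable: kept alongside
print(basis)  # [(1, 0), (0, 5)]
```

A backward exploration stops as soon as a whole iteration returns False for
every candidate.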

In the next section we will discuss the practical issues related to our methodology. 

6 Tool Support for the Composite Constraint Method 

The Composite Symbolic Library uses an object-oriented design to combine
different assertional languages [YKTB01]. An abstract interface defines the
operations used in symbolic verification: Boolean operations, equivalence and
entailment tests, and image computations (for the Pre and Post operators). To
define a new assertional language, one simply has to implement the abstract
interface with specialized operations. Currently, the Composite Symbolic Library
provides two basic symbolic representations: BDDs via the Colorado University
Decision Diagram Package (CUDD) [CUDD], and linear integer arithmetic
constraints via the Omega Library [KMP+95]. Operations on composite symbolic
representations are implemented using the corresponding operations on these
basic symbolic representations [BGL00]. The object-oriented design of the
Composite Symbolic Library makes it possible to write polymorphic verification
procedures, i.e., verification procedures that dynamically select symbolic
representations based on the input specification. The input language of the
Composite Symbolic Library is called the Action Language [Bul00]. The Action
Language is a specification language for reactive software systems which
supports both synchronous and asynchronous compositions and hierarchical
specifications; currently, it supports Boolean, enumerated, and integer types.
In this setting, a specification consists of a set of modules and atomic
actions. Modules can be defined by composing other modules or actions using
synchronous or asynchronous compositions. Atomic actions are defined using
formulas on primed and unprimed variables, as in our abstract protocol example.
In action formulas only Boolean logic and linear arithmetic operators are
allowed. Given an input specification, the Action Language Verifier [BYK01]
translates it to a composite constraint representation and checks the
verification conditions by computing forward or backward fixpoints using the
Composite Symbolic Library. Verification conditions are specified in the
temporal logic CTL.

In general, for the class of systems that can be specified in the Action
Language, CTL model checking is undecidable. To achieve convergence, one can use
conservative approximation techniques. Such operations have been successfully
used for the verification of infinite-state systems using linear-arithmetic
constraints (see e.g. [BGP99]). The Action Language Verifier extends these
results to composite symbolic representations. Specifically, it implements a
generalization of the widening operation on convex polyhedra to compute upper
bounds for fixpoints which do not converge. It also uses truncated fixpoint
computations to compute lower bounds. Using both these techniques, it is
possible to compute both lower and upper approximations for any CTL property.

The Composite Symbolic Library and the Action Language Verifier are available
at the URL http://www.cs.ucsb.edu/~bultan/composite/.



7 Experimental Results 

In our experiments we focused on two kinds of CTL formulas: safety properties
expressing mutual exclusion and liveness properties expressing freedom from
starvation. In general, a safety property, expressed via the CTL formula
$AG(\Phi)$, holds whenever all reachable states belong to the set of safe states
$[\![\Phi]\!]$. Clearly, it can be proved by contraposition, by showing that
there are no reachable states that belong to the set of unsafe states
$[\![\neg\Phi]\!]$; in CTL this corresponds to the equivalence
$AG(\Phi) \equiv \neg EF(\neg\Phi)$. Furthermore, one can show that
$EF(\Phi) = Pre^{*}(\Phi)$ for any $\Phi$. This implies that AG-properties can
be verified by first using symbolic backward reachability with seed $\neg\Phi$
to compute $EF(\neg\Phi)$, and then checking that the initial states are not in
the resulting set of states. In CTL it is possible to express more complicated
formulas like the liveness property $AG(\Phi_1 \to AF(\Phi_2))$. This formula
can be read as follows: if $\Phi_1$ holds at state $s$, then $\Phi_2$ must
eventually hold in all executions starting at $s$. Liveness properties can be
checked algorithmically via nested greatest and least fixpoint computations.
Verification of liveness for Vector Addition Systems with transfer arcs (or with
tests for zero in the guards) is undecidable [EFM99]. However, constraint-based
model checkers can still be used as incomplete verification procedures, using
heuristics and approximation techniques to enforce termination [BGP99].
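
The two fixpoint schemes can be illustrated on a finite, explicitly enumerated
transition system. The Python sketch below is not the symbolic procedure of the
verifier; it assumes a total transition relation given as a set of pairs, and
the function names are hypothetical. The liveness check uses the standard CTL
equivalence $AG(\Phi_1 \to AF(\Phi_2)) \equiv \neg EF(\Phi_1 \wedge EG(\neg\Phi_2))$.

```python
def ef(target, transitions):
    """Least fixpoint: states from which some state in target is reachable."""
    reach = set(target)
    changed = True
    while changed:
        changed = False
        for u, v in transitions:
            if v in reach and u not in reach:
                reach.add(u)
                changed = True
    return reach

def eg(region, transitions):
    """Greatest fixpoint: states with a successor chain staying in region."""
    keep = set(region)
    changed = True
    while changed:
        changed = False
        for u in list(keep):
            if not any(a == u and b in keep for a, b in transitions):
                keep.discard(u)
                changed = True
    return keep

def holds_ag(initial, safe, states, transitions):
    """AG(safe): no initial state can reach an unsafe state."""
    unsafe = set(states) - set(safe)
    return not (set(initial) & ef(unsafe, transitions))

def holds_ag_af(initial, phi1, phi2, states, transitions):
    """AG(phi1 -> AF(phi2)), checked as not EF(phi1 and EG(not phi2))."""
    stuck = eg(set(states) - set(phi2), transitions)
    return not (set(initial) & ef(set(phi1) & stuck, transitions))

states = {0, 1, 2}
chain = {(0, 1), (1, 2), (2, 2)}
print(holds_ag_af({0}, {0}, {2}, states, chain))             # True
print(holds_ag_af({0}, {0}, {2}, states, chain | {(1, 1)}))  # False
```

The greatest fixpoint is computed first and the least fixpoint second, which is
the order of the steps reported in the Strategy column of the experiments.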

The table in Fig. 4 summarizes the practical results we obtained via the
Action Language Verifier. We performed our experiments on two different models
of the client-server protocol described in Section 3. The model ‘B’ of Fig. 4 is
the abstract protocol of Fig. 3. The model ‘I’ is a refinement of model ‘B’ in
which the atomic invalidation broadcast is replaced by the invalidation loop
formulated at the abstract level as shown in Fig. 5. Note that this formulation
needs guards with tests for zero. Tests for zero break the decidability of the
verification of safety properties [Del00], i.e., approximations might be
necessary in order to verify AG-formulas for the model ‘I’. For both models, we
considered the CTL properties listed in Fig. 6. The parameters of the
experimental evaluation were the following: ‘UA’ denotes the use of
approximations in the fixpoint computations; ‘UF’ denotes the use of approximate
forward state exploration (see explanation below); ‘Strategy’ denotes the
strategy used to check the properties, namely the sequence of steps (f = forward
exploration, EF = least fixpoint, EG = greatest fixpoint) annotated with the
number of iterations needed for each
of them, e.g., EF(4) means that the least fixpoint is reached in four iterations;
‘Memory’ and ‘Time’ denote the total resource consumption for the application
of the corresponding strategy.

Property  Model  UA  UF  Strategy            Memory (Mbytes)  Time (secs.)
P1-2      B              EF(4)               10.2             0.60
P1-2      B              f(7), EF(1)          9.7             0.52
P3        B              EG(3), EF(5)        14.1             2.37
P3        B          √   f(7), EG(3), EF(1)  10.7             0.68
P4        B              EG(3), EF(8)        25.6             9.34
P4        B              f(7), EG(3), EF(1)  11.2             0.74
P5        B      √       EG(4), EF(11)       14.1             3.01
P5        B          √   f(7), EG(3), EF(1)  10.4             0.61
P1-2      I              EF(4)               10.4             0.59
P1-2      I          √   f(6), EF(1)          9.8             0.50
P3        I              EG(3), EF(5)        11.9             2.01
P3        I              f(6), EG(3), EF(1)  10.6             0.65
P4        I          √   f(6), EG(3), EF(1)  11.6             0.81

Fig. 4. Experimental results obtained on a SUN ULTRA 10 workstation with 768
Mbytes of main memory, running SunOS 5.7.

ServeE → GrantE  :  $x_S = 0 \wedge x_E = 0$
ServeE → ServeE  :  $x_S \geq 1 \wedge x'_N = x_N + 1 \wedge x'_S = x_S - 1$
ServeE → ServeE  :  $x_E \geq 1 \wedge x'_N = x_N + 1 \wedge x'_E = x_E - 1 \wedge \neg ex'$

Fig. 5. The invalidation loop in state ServeE.

Property  Property specification in CTL
P1-2      $\neg EF((x_S \geq 1 \wedge x_E \geq 1) \vee x_E \geq 2)$
P3        $AG(x_{WS} \geq 1 \to AF(x_S \geq 1))$
P4        $AG(x_{WS} \geq N \to AF(x_S \geq N))$, $N > 1$
P5        $AG(x_{WE} \geq 1 \to AF(x_E \geq 1))$

Fig. 6. Specification of the properties for our case-study.

Some explanations for Fig. 4 are in order. Let us start from the model ‘B’. As
expected, we verified the safety properties P1 and P2 of Section 2, i.e., the
CTL formula P1-2 of Fig. 6, without need of any approximation. We also verified
the liveness properties P3 and P4 without using approximations. Since our method
works on abstract models in which we forget the identifiers of clients, the
liveness property P4 must be read as follows: if more than k clients are
waiting, then at least k clients will get the desired access, i.e., a sort of
freedom from global deadlocks for the original concrete protocol. Fixpoint
computations for the liveness property P5 do not converge without approximation
techniques. However, we were able to prove the property using truncated fixpoint
computations and widening.






We also investigated the use of an a priori forward exploration of the
reachable states of the abstract protocol (indicated as ‘UF’ in Fig. 4).
Specifically, using widening techniques we first computed an over-approximation
of the set of reachable states, and then used it to restrict the search space
during backward reachability. As an example, for model ‘B’ the approximate
forward exploration (indicated as ‘f’ in Fig. 4) allowed us to verify all the
properties faster, e.g., P1-2 in one iteration instead of four. By caching the
approximated reachable set, it should be possible to further improve the
execution times of Fig. 4.

Let us now consider the model with the invalidation loop. Again, to verify P1-2
we needed no approximations (however, note that termination is not guaranteed
in this case). For property P3, the innermost fixpoint converged in three
iterations, whereas the outermost fixpoint diverged without approximations.
Using approximation techniques, both fixpoints converged and we were able to
verify the property. One interesting result is that we were able to verify
property P3 without using approximations in the backward fixpoints when we
combined them with the approximate forward exploration. The approximate forward
exploration for model ‘I’ converges in six iterations, and using it we can
verify P3 more efficiently. For property P4, as with P3, the innermost fixpoint
converged in three iterations, whereas the outermost fixpoint diverged without
approximations. However, when we used approximation techniques, although the
fixpoint computations converged, the results were not strong enough to verify
the property. When we used approximate forward exploration, the backward
fixpoint computations converged and we were able to verify the property. When we
tried to verify property P5 for model ‘I’, the inner fixpoint computation did
not converge. When we used approximations, the results were not strong enough to
verify the property. Even when we used approximate forward exploration, the
results did not change. Hence, we were not able to verify or falsify the
property P5 for model ‘I’.

8 Conclusions and Related Works 

In this paper we have shown that existing constraint technology can be
successfully applied to the formal verification of protocols parametric in the
number of participants. Abstract interpretation works as a bridge between
protocol specifications and models that can be handled via constraint-based
verification methods working on heterogeneous data, like the Action Language
Verifier of [BYK01]. The counting abstraction has been introduced in [GS92],
where families of asynchronous CCS processes were verified via a reduction to
Petri Nets. In [DEP99, Del00], a similar abstraction has been applied to the
verification of cache coherence protocols (but not to directory-based ones such
as the protocol of [Ger00]), and of concurrent systems specified as broadcast
protocols [EN98]. The specification in [DEP99, Del00] allows synchronous
communication but it does not admit heterogeneous data like global Boolean
variables.

In [BCR01], the counting abstraction has been applied to the verification of
skeletons of multi-threaded libraries. The resulting abstract models are
basically Petri Nets with state. The authors analyze them using the Karp-Miller
coverability construction, i.e., forward exploration with accelerations [EN98].
However, this procedure is not guaranteed to terminate in the presence of
broadcast communication [EFM99].

In [PRZ01], an alternative method based on deductive verification has been
used to verify safety properties of parameterized systems like German's
protocol, which we considered in this paper. The method of [PRZ01] uses
heuristics to discover invariants for parameterized systems, and to verify that
the discovered invariants are inductive. The method is incomplete, but fully
automatic (it is based on BDDs) and sound for safety properties. Differently
from the previously mentioned approaches, constraint-based tools like the Action
Language Verifier provide an incomplete but fully automatic and sound way of
checking full CTL formulas. We exploited this feature to automatically verify
new safety properties (property P2 has not been studied in [PRZ01]) and liveness
properties for our case-study. On the other hand, the specification language
used in [PRZ01] allows one to associate complex data structures, e.g., arrays
storing process identifiers, to individual processes. Extending our approach in
order to handle parameterized systems with this kind of data structure seems an
interesting direction for future research.